Loading US Lobbying Data Into Neo4j

Let's see how to load US lobbying data into Neo4j.

By Scott Sosna · Sep. 16, 19 · Tutorial

Lobbying money

In the United States, the money spent for political reasons is immense and growing, especially since the Supreme Court struck down limits on federal campaign donations. With the 2020 federal elections still more than a year away, candidates have already raised more than $800M. An alternative way to influence federal politics is independent or non-coordinated spending, also known as soft money, where the political expenditures are not coordinated with candidates but advocate for or against candidates or issues. Dark money is political spending by non-profit organizations whose donors can remain anonymous.

Lobbying

An example of political spending occurring outside of election campaigns is lobbying. While the exact definition of lobbying differs — e.g., federal vs. state, state vs. state — in general, lobbying attempts to influence a politician, legislature, or public official to see a specific view on an issue, law, policy, regulation, etc. Lobbying has been present since the founding of the United States and is protected by the First Amendment as freedom of speech and right to petition. In 2018, almost $3.5B was spent lobbying the federal government.

Prior to 1996, there was little transparency about lobbying the federal government. The Lobbying Disclosure Act of 1995 increased accountability by defining lobbying in law and, more importantly, by requiring that lobbying efforts be documented through filings made to the US Senate and House of Representatives.

Filings Raw Data

On the Senate site, the filing data is available as zipped XML documents by year and quarter since 1999. A full XML file contains approximately 1,000 filings, though some contain fewer (likely depending on the internal system into which filings are uploaded).

<Filing ID="403EFFC2-F7B6-4FB0-AA2F-2584CC25FF3E" Year="2019" Received="2019-04-04T16:39:08.047" Amount="" Type="FIRST QUARTER REPORT" Period="1st Quarter (Jan 1 - Mar 31)">
   <Registrant RegistrantID="1810" RegistrantName="AMERICAN BIRD CONSERVANCY" GeneralDescription="" Address="4301 Connecticut Ave NW #451 Washington, DC 20008" RegistrantCountry="USA" RegistrantPPBCountry="USA" />
   <Client ClientName="AMERICAN BIRD CONSERVANCY" GeneralDescription="" ClientID="12" SelfFiler="TRUE" ContactFullname="STEVE HOLMER" IsStateOrLocalGov="TRUE" ClientCountry="USA" ClientPPBCountry="USA" ClientState="DISTRICT OF COLUMBIA" ClientPPBState="DISTRICT OF COLUMBIA" />
   <Lobbyists>
      <Lobbyist LobbyistName="HOLMER, STEVE" LobbyistCoveredGovPositionIndicator="NOT COVERED" OfficialPosition="" ActivityInformation="A" />
      <Lobbyist LobbyistName="Cipolletti, Jennifer L." LobbyistCoveredGovPositionIndicator="NOT COVERED" OfficialPosition="" ActivityInformation="A" />
   </Lobbyists>
   <GovernmentEntities>
      <GovernmentEntity GovEntityName="Bureau of Land Management (BLM)" />
      <GovernmentEntity GovEntityName="SENATE" />
      <GovernmentEntity GovEntityName="Office of Management &amp; Budget (OMB)" />
      <GovernmentEntity GovEntityName="U.S. Forest Service" />
      <GovernmentEntity GovEntityName="Agriculture, Dept of (USDA)" />
      <GovernmentEntity GovEntityName="U.S. Fish &amp; Wildlife Service (USFWS)" />
      <GovernmentEntity GovEntityName="HOUSE OF REPRESENTATIVES" />
   </GovernmentEntities>
   <Issues>
      <Issue Code="ENVIRONMENT/SUPERFUND" SpecificIssue="Saving Americas Pollinators Act, H.R. 1337,  and H.R.230, The Ban Toxic Pesticides Act of 2019" />
      <Issue Code="AGRICULTURE" SpecificIssue="Farm Bill of 2018 implementation issues, rule-makings, and appropriations" />
      <Issue Code="NATURAL RESOURCES" SpecificIssue="Interior Appropriations" />
      <Issue Code="ANIMALS" SpecificIssue="Bird Safe Buildings Act Albatross and Petrel Conservation Act Interior Appropriations for US Fish and Wildlife Service" />
      <Issue Code="REAL ESTATE/LAND USE/CONSERVATION" SpecificIssue="Greater Sage-Grouse Endangered Species Act Exemption" />
   </Issues>
   <ConvictionDisclosure>
      <Conviction ConvictionReported="NO" />
   </ConvictionDisclosure>
</Filing>

Using relational database terminology, the top-level <Filing> node is an associative entity that pulls together all the related information about the filing (unique identifier, period represented, dollar amount spent on the effort, date of filing), and the child nodes are the specifics.

  • Client: Special interest groups — e.g., corporations, non-profits, industries, national and international governments — advocating for/against legislation or regulations under consideration by the federal government.
  • Lobbyist: A professional hired by the client to present the client's position and persuade the federal government to take the client's position with regard to proposed legislation and regulations.
  • Registrant: The organization performing lobbying activities on behalf of the client, registered with the US government. Clients may lobby on their own behalf as both client and registrant or may hire firms that specialize in lobbying and hire lobbyists.
  • Government Entity: A department, regulatory agency, commission, or branch of government lobbied. Multiple entities are usually associated with a single filing; by far the most lobbied entities are the Senate and House of Representatives.
  • Issue: Filings are assigned to general categories to simplify reporting, e.g., Education, Transportation, Natural Resources. Each filing contains a detailed description of the lobbying effort.
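
To make the structure above concrete, here is a minimal sketch that pulls one filing apart into the pieces just described. It uses Python's standard xml.etree.ElementTree rather than the JAXB-generated classes from the project, the Filing dataclass and parse_filing function are hypothetical names, and the embedded sample is an abridged copy of the filing shown earlier.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List, Optional

# Abridged copy of the sample filing (most attributes and children omitted).
SAMPLE = """<Filing ID="403EFFC2-F7B6-4FB0-AA2F-2584CC25FF3E" Year="2019" Amount="" Type="FIRST QUARTER REPORT" Period="1st Quarter (Jan 1 - Mar 31)">
   <Registrant RegistrantID="1810" RegistrantName="AMERICAN BIRD CONSERVANCY" />
   <Client ClientName="AMERICAN BIRD CONSERVANCY" ClientID="12" SelfFiler="TRUE" />
   <Lobbyists>
      <Lobbyist LobbyistName="HOLMER, STEVE" />
      <Lobbyist LobbyistName="Cipolletti, Jennifer L." />
   </Lobbyists>
   <GovernmentEntities>
      <GovernmentEntity GovEntityName="SENATE" />
      <GovernmentEntity GovEntityName="Office of Management &amp; Budget (OMB)" />
   </GovernmentEntities>
   <Issues>
      <Issue Code="NATURAL RESOURCES" SpecificIssue="Interior Appropriations" />
   </Issues>
</Filing>"""


@dataclass
class Filing:  # hypothetical container, not a class from the project
    filing_id: str
    period: str
    amount: Optional[str]  # kept as text; it is empty in the sample
    registrant: str
    client: str
    self_filer: bool
    lobbyists: List[str]
    government_entities: List[str]
    issue_codes: List[str]


def parse_filing(elem: ET.Element) -> Filing:
    """Flatten one <Filing> element and its children into a Filing record."""
    client = elem.find("Client")
    return Filing(
        filing_id=elem.get("ID"),
        period=elem.get("Period"),
        amount=elem.get("Amount") or None,  # empty string becomes None
        registrant=elem.find("Registrant").get("RegistrantName"),
        client=client.get("ClientName"),
        self_filer=client.get("SelfFiler") == "TRUE",
        lobbyists=[e.get("LobbyistName") for e in elem.iter("Lobbyist")],
        government_entities=[
            e.get("GovEntityName") for e in elem.iter("GovernmentEntity")
        ],
        issue_codes=[e.get("Code") for e in elem.iter("Issue")],
    )


filing = parse_filing(ET.fromstring(SAMPLE))
print(filing.client, filing.lobbyists)
```

The self_filer flag captures the case described under Registrant, where the client lobbies on its own behalf and appears as both client and registrant.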

Loading Neo4j

The basic process for loading the filings data is fairly straightforward:

  1. Unzip the XML documents (files) from the downloaded zip file
  2. Deserialize the XML file
  3. For each filing, find or create the supporting Neo4j nodes (the registrants, lobbyists, clients, issues, and government entities), and then create a new filing node with the appropriate relationships
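
The three steps above can be sketched roughly as follows. This is not the project's Java code: it uses Python's standard library, and an in-memory InMemoryGraph class stands in for Neo4j so the sketch runs without a database. All function and class names are hypothetical. It also assumes that the XML has no namespace, that RegistrantID and ClientID are unique, that lobbyists, government entities, and issues can be keyed by name or code, and that every supporting node is linked to its filing by a FILED relationship (the article names that relationship but not its direction).

```python
import io
import zipfile
import xml.etree.ElementTree as ET


class InMemoryGraph:
    """Stand-in for the database so the sketch runs without Neo4j."""

    def __init__(self):
        self.nodes = {}          # (label, key) -> properties
        self.relationships = []  # (start node id, type, end node id)

    def find_or_create(self, label, key, **props):
        node_id = (label, key)
        self.nodes.setdefault(node_id, props)  # create only if missing
        return node_id

    def relate(self, start, rel_type, end):
        self.relationships.append((start, rel_type, end))


def iter_filings(zip_bytes):
    """Steps 1 and 2: open the zip and yield each <Filing> element."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".xml"):
                continue
            with archive.open(name) as member:
                root = ET.parse(member).getroot()
            yield from root.iter("Filing")


def load_filing(graph, elem):
    """Step 3: find or create supporting nodes, then link them to the filing."""
    filing = graph.find_or_create(
        "Filing", elem.get("ID"), period=elem.get("Period")
    )
    supporting = []
    registrant = elem.find("Registrant")
    if registrant is not None:
        supporting.append(("Registrant", registrant.get("RegistrantID")))
    client = elem.find("Client")
    if client is not None:
        supporting.append(("Client", client.get("ClientID")))
    supporting += [("Lobbyist", e.get("LobbyistName")) for e in elem.iter("Lobbyist")]
    supporting += [
        ("GovernmentEntity", e.get("GovEntityName"))
        for e in elem.iter("GovernmentEntity")
    ]
    supporting += [("Issue", e.get("Code")) for e in elem.iter("Issue")]
    for label, key in supporting:
        node = graph.find_or_create(label, key)
        graph.relate(node, "FILED", filing)  # assumed relationship and direction


def load_zip(zip_bytes, graph=None):
    """Load every filing in a downloaded zip file into the graph."""
    if graph is None:
        graph = InMemoryGraph()
    for elem in iter_filings(zip_bytes):
        load_filing(graph, elem)
    return graph
```

Calling load_zip with the raw bytes of a downloaded archive returns the populated graph. A lobbyist who appears in many filings ends up as a single node with one relationship per filing, which is the point of the find-or-create step.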

The documents are well-formed, so deserializing the XML with JAXB is straightforward. I manually created an XSD representing the XML data and generated annotated POJOs from it to use when iterating through the filings.

I'm using the Neo4j Object Graph Mapping library (OGM) to load the data into Neo4j. Neo4j OGM is an annotation-driven persistence library, similar in concept to the Java Persistence API and its implementations (e.g., Hibernate), except that the annotated objects represent nodes and relationships. Neo4j OGM also supports sessions, transactions, and programmatic queries.
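
Neo4j OGM is a Java library, so the following is not OGM code. As a rough illustration of the find-or-create behavior the load needs, this sketch builds a parameterized Cypher statement using the MERGE clause, which matches an existing pattern or creates it. The merge_statement helper, the key and filingId property names, and the use of FILED for every node type are assumptions made for illustration.

```python
# Labels cannot be passed as Cypher parameters, so they are checked against
# a fixed set before being placed into the query text.
ALLOWED_LABELS = {"Client", "Registrant", "Lobbyist", "Issue", "GovernmentEntity"}


def merge_statement(label, key_value, filing_id):
    """Return (query, parameters) that find or create a node and link it to a filing."""
    if label not in ALLOWED_LABELS:
        raise ValueError(f"unexpected label: {label!r}")
    query = (
        f"MERGE (n:{label} {{key: $key}}) "
        "MERGE (f:Filing {filingId: $filingId}) "
        "MERGE (n)-[:FILED]->(f)"  # assumed relationship direction
    )
    # Values travel as parameters, never as part of the query text.
    return query, {"key": key_value, "filingId": filing_id}


query, params = merge_statement(
    "Lobbyist", "HOLMER, STEVE", "403EFFC2-F7B6-4FB0-AA2F-2584CC25FF3E"
)
print(query)
print(params)
```

Passing names as parameters rather than concatenating them into the query matters here because lobbyist names contain commas, periods, and occasionally quotes.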

After several attempts, here's how I ended up modeling the data:

Model of data

Challenges

  • Schema Definition. (Filing) nodes far outnumber other node types and are the concept around which all other information is related. As I'll demonstrate in a subsequent article, all useful queries include the [FILED] relationship, making visualizations difficult.
  • Performance. The constant querying of Neo4j for existing nodes (clients, registrants, lobbyists, issues, and government entities) slows the load process as more filings are loaded. (Lobbyist) nodes are the next largest by volume, so caching or better database indices may be required; this is definitely an area to investigate.
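
One possible shape for the caching idea mentioned under Performance is sketched below. It is a guess at an approach, not something the project implements. The make_cached_finder helper and the fake lookup are hypothetical, the cache size is arbitrary, and the sketch assumes the loader is the only process writing to the database during the load.

```python
from functools import lru_cache


def make_cached_finder(find_or_create_in_db, max_entries=100_000):
    """Wrap a database lookup so repeated (label, key) pairs skip the database."""

    @lru_cache(maxsize=max_entries)
    def find_or_create(label, key):
        return find_or_create_in_db(label, key)

    return find_or_create


# Demonstration with a fake lookup that records each database round trip.
db_calls = []


def fake_db_lookup(label, key):
    db_calls.append((label, key))
    return f"{label}:{key}"


finder = make_cached_finder(fake_db_lookup)
for name in ["HOLMER, STEVE", "Cipolletti, Jennifer L.", "HOLMER, STEVE"]:
    finder("Lobbyist", name)

print(len(db_calls), finder.cache_info())
```

The bounded least-recently-used cache keeps memory use predictable even if the number of distinct lobbyists is large, at the cost of occasionally looking up an evicted name again.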

What's Next?

My next article will demonstrate different ways to query the database for interesting facts.

The project source can be found on GitHub.
