How to Adopt A Federal Government Dataset

By Kin Lane · Jan. 22, 14

When I pulled more than 5,000 datasets from 22 federal agencies after the implementation of OMB Memorandum M-13-13, Open Data Policy: Managing Information as an Asset, I was reminded how much work there is still to do around opening up government data. Overall I gave the efforts a C grade, because it seemed like agencies just rounded up a bunch of data lying around and published it to meet the deadline, without much regard for the actual use or consumption of the data.

Even so, I know how hard it was to round up these 5,000 datasets in government, and because of that, I can't get them out of my mind. I want to go through all of them, clean them up, and share them back with each agency. Obviously I can't do that, but what is the next best thing? As I was walking the other day, I thought it would be good to have a sort of adoption tool, so that anyone can step up, adopt a federal agency's dataset, and improve it.

As with most of my projects, it is hard to get them out of my head until I get at least a proof of concept up and running. So I developed Federal Agency Dataset Adoption and published it to GitHub. I published a JSON listing of the 22 federal agencies, along with the data.json file that each agency published. Next I created a simple UI to browse the agencies and datasets and to view details and distributions, with the ability to "adopt" a dataset.
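To make the data.json listing a little more concrete, here is a minimal Python sketch of how a browser UI might read one agency's file and list each dataset with its distributions. The field names (`title`, `identifier`, `distribution`, `accessURL`, `format`) are assumed to follow the Project Open Data metadata schema, the sample entries and URLs are made up, and the function names are hypothetical. It is not the code behind the actual site.

```python
import json

# Hypothetical sample standing in for one agency's data.json file.
SAMPLE_DATA_JSON = """
[
  {
    "title": "Example Dataset A",
    "identifier": "example-a",
    "distribution": [
      {"accessURL": "https://example.gov/a.csv", "format": "text/csv"}
    ]
  },
  {
    "title": "Example Dataset B",
    "identifier": "example-b",
    "accessURL": "https://example.gov/b.html",
    "format": "text/html"
  }
]
"""


def load_datasets(raw):
    """Parse a data.json document into a list of dataset entries.

    Assumption: the file is either a bare JSON array of datasets or an
    object that wraps the array in a "dataset" key.
    """
    data = json.loads(raw)
    if isinstance(data, dict):
        data = data.get("dataset", [])
    return data


def list_distributions(dataset):
    """Return (url, format) pairs for a single dataset entry."""
    pairs = [
        (dist.get("accessURL"), dist.get("format"))
        for dist in dataset.get("distribution", [])
    ]
    # Fall back to a dataset-level accessURL when no distributions are listed.
    if not pairs and dataset.get("accessURL"):
        pairs.append((dataset["accessURL"], dataset.get("format")))
    return pairs


def summarize(raw):
    """Build the rows a simple dataset browser could display."""
    return [
        {
            "identifier": d.get("identifier"),
            "title": d.get("title"),
            "distributions": list_distributions(d),
        }
        for d in load_datasets(raw)
    ]


if __name__ == "__main__":
    for row in summarize(SAMPLE_DATA_JSON):
        print(row["title"], row["distributions"])
```

The second sample entry points at an HTML page rather than a data file, which is the kind of reference that makes a dataset a good candidate for adoption.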

When you choose to adopt a federal agency dataset, the site authenticates with your GitHub account using OAuth, then creates a repository to house any work that will occur. Each dataset you adopt gets its own branch within the repository, a README file, and a copy of the dataset's entry from its agency's data.json file.
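The adoption step can be sketched as a plan of what gets created for each dataset: a branch, a README, and a copy of the dataset's data.json entry. The Python below is only an illustration under assumptions. The branch naming rule, the README wording, the file names, and the function names are all hypothetical, and the GitHub calls that would actually create the branch and commit the files are left out.

```python
import json
import re


def branch_name(identifier):
    """Turn a dataset identifier into a branch-friendly slug (assumed rule)."""
    slug = re.sub(r"[^a-z0-9]+", "-", identifier.lower()).strip("-")
    return slug or "dataset"


def adoption_plan(agency, dataset):
    """Describe the branch and files for one adopted dataset.

    `agency` is the agency's display name and `dataset` is the dataset's
    entry copied from that agency's data.json file.
    """
    title = dataset.get("title", "Untitled dataset")
    branch = branch_name(dataset.get("identifier") or dataset.get("title", ""))
    readme = (
        "# {title}\n\n"
        "Adopted from {agency}. This branch is a workspace for cleaning up\n"
        "the dataset and recording new distributions in data.json.\n"
    ).format(title=title, agency=agency)
    return {
        "branch": branch,
        "files": {
            "README.md": readme,
            # Copy of the dataset's entry, ready to be edited by the adopter.
            "data.json": json.dumps(dataset, indent=2),
        },
    }


if __name__ == "__main__":
    entry = {"title": "Example Dataset A", "identifier": "EX/Dataset A"}
    plan = adoption_plan("Example Agency", entry)
    print(plan["branch"])
    print(plan["files"]["README.md"])
```

Keeping the plan as plain data makes it easy to inspect before anything is written to a real repository.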

I would copy the actual datasets to the repo, but many of the references are just HTML or ASP pages, and you have to look up the data manually. Each repo is meant to be a workspace, and users who adopt datasets can just update the data.json to point to any new data distributions that are generated. I will programmatically pull these updates and register them with the master site on a regular basis.
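One way to picture the periodic pull is as a comparison between the agency's original entry and the adopter's updated copy, keeping only the distributions that are new. This Python sketch assumes the same Project Open Data style fields (`distribution`, `accessURL`). The sample entries and function names are hypothetical, and fetching the files from each adopter's repository is not shown.

```python
def distribution_urls(entry):
    """Collect every access URL mentioned in a data.json dataset entry."""
    urls = [
        dist["accessURL"]
        for dist in entry.get("distribution", [])
        if dist.get("accessURL")
    ]
    # Some entries only carry a dataset-level accessURL.
    if entry.get("accessURL"):
        urls.append(entry["accessURL"])
    return urls


def new_distributions(original, adopted):
    """Return URLs present in the adopted entry but not in the original."""
    known = set(distribution_urls(original))
    return [url for url in distribution_urls(adopted) if url not in known]


if __name__ == "__main__":
    # Hypothetical entries: the agency's original points at an ASP page,
    # and the adopter has added a cleaned-up CSV.
    original = {"identifier": "example-b", "accessURL": "https://example.gov/b.asp"}
    adopted = {
        "identifier": "example-b",
        "accessURL": "https://example.gov/b.asp",
        "distribution": [
            {"accessURL": "https://example.org/cleaned/b.csv", "format": "text/csv"}
        ],
    }
    for url in new_distributions(original, adopted):
        print("register:", url)
```

Only the difference is registered, so the agency's original references stay untouched in the central listing.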

The system is meant to help me track which datasets I'm working on, and if other people want to get involved, awesome! I envision it acting as a sort of distributed directory, in which agencies, and consumers of agency data, can find alternate versions of federal government data. Additionally, data-obsessed folks like me can clean up data and contribute back to the lifecycle in a federated way, using our own GitHub accounts combined with a centralized registry.

As with my other side projects, who knows where this will go. I'd like to get some other folks involved, and maybe get the attention of agencies, then drum up some funding so I can put some more cycles into it. If you are interested, you can get involved via the Federal Government Dataset Adoption GitHub repository.


Published at DZone with permission of Kin Lane, DZone MVB. See the original article here.
