Using GitHub as an API Index and Data Store
GitHub is far more than a social coding platform. It can also serve as an API index and data store for publishing and sharing open data so that it is highly available for free.
I am spending a lot of time studying how companies use GitHub, the social coding platform, as part of their software and API development lifecycle. More companies, like Netflix, are using it as part of their Continuous Integration workflow, something that API service providers like APIMATIC are looking to take advantage of with a new wave of services and tooling. This usage of GitHub goes well beyond just managing code and is making the platform more of an engine in any Continuous Integration and API lifecycle workflow.
I run all my API research project sites on GitHub. I do this because it is secure and static, and because it gives me a very potent way to manage not just a single website but over 200 individual open data and API projects. Each one of my API research areas leverages a GitHub Jekyll core, providing a machine-readable index of the companies, news, tools, and other building blocks I'm aggregating throughout my research.
Recently, this approach has moved beyond the core areas of my API research and is something I'm applying to my API discovery work, profiling the resources available with popular API platforms like Amazon Web Services, and across my government work, such as my GSA index. Each of these projects is managed using GitHub, providing a machine-readable index of the disparate APIs in a single APIs.json index that includes OpenAPI specs for each of the APIs included. When complete, these indexes can provide a runtime discovery engine of APIs used as part of integrations, providing an index of single APIs, as well as potentially of many distributed APIs brought together into a single meaningful collection.
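To make the idea of runtime discovery concrete, here is a minimal sketch of walking an APIs.json-style index and pulling out the OpenAPI spec URL for each API that lists one. The index content, the example.com URLs, and the `x-openapi-spec` property type are placeholders assumed for illustration, not values taken from any of the projects described above; `find_openapi_specs` is a hypothetical helper name.

```python
import json

# Hypothetical APIs.json-style index; every name and URL is a placeholder.
INDEX_JSON = """
{
  "name": "Example API Index",
  "description": "Illustrative index of two APIs.",
  "url": "https://example.com/apis.json",
  "apis": [
    {
      "name": "Example Storage API",
      "humanURL": "https://example.com/storage",
      "baseURL": "https://api.example.com/storage",
      "properties": [
        {"type": "x-openapi-spec", "url": "https://example.com/storage/openapi.json"},
        {"type": "x-documentation", "url": "https://example.com/storage/docs"}
      ]
    },
    {
      "name": "Example Search API",
      "humanURL": "https://example.com/search",
      "baseURL": "https://api.example.com/search",
      "properties": [
        {"type": "x-documentation", "url": "https://example.com/search/docs"}
      ]
    }
  ]
}
"""


def find_openapi_specs(index, spec_type="x-openapi-spec"):
    """Map each API name to the URL of its OpenAPI spec, if it lists one.

    The spec_type value is an assumption; adjust it to whatever property
    type a given index actually uses for its OpenAPI definitions.
    """
    specs = {}
    for api in index.get("apis", []):
        for prop in api.get("properties", []):
            if prop.get("type", "").lower() == spec_type:
                specs[api["name"]] = prop["url"]
                break  # take the first matching property for this API
    return specs


index = json.loads(INDEX_JSON)
specs = find_openapi_specs(index)
```

In this sketch only the storage API ends up in `specs`, since the search API lists documentation but no spec. A real discovery client would load the index from the repository rather than from an inline string.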
I've started pushing this approach even further with my Knight Foundation-funded Adopta.Agency work, making the GitHub repository more than just a machine-readable index of many APIs. I'm also using the _data folder as a JSON or YAML data store, which can then also be indexed as part of the APIs.json and OpenAPI Spec for each project. I've been playing with different ways of storing and working with JSON and YAML in Jekyll on GitHub for a while now, but now I'm trying to develop projects that are a seamless open data store, as well as an API index, providing the best of both worlds.
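As a rough illustration of treating the _data folder as a data store, the sketch below loads every JSON file in a repository's _data folder into a dictionary keyed by file name, which mirrors how Jekyll exposes those files under `site.data`. The file names and records are made up for the demo, `load_data_folder` is a hypothetical helper, and YAML files are skipped here because parsing them would need a third-party library.

```python
import json
import tempfile
from pathlib import Path


def load_data_folder(repo_root):
    """Load every JSON file in <repo_root>/_data, keyed by file stem."""
    data_dir = Path(repo_root) / "_data"
    store = {}
    for path in sorted(data_dir.glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            store[path.stem] = json.load(handle)
    return store


def demo():
    """Build a throwaway repository layout and load it (sample data only)."""
    with tempfile.TemporaryDirectory() as repo_root:
        data_dir = Path(repo_root) / "_data"
        data_dir.mkdir()
        (data_dir / "companies.json").write_text(
            json.dumps([{"name": "Example Co"}]), encoding="utf-8"
        )
        (data_dir / "tools.json").write_text(
            json.dumps([{"name": "Example Tool"}]), encoding="utf-8"
        )
        return load_data_folder(repo_root)


store = demo()
```

Pointing `load_data_folder` at a local clone or fork of a project would give the same kind of dictionary, one entry per data file.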
This is not a model for delivering high-performance, high-availability APIs. This is a model for publishing and sharing open data so that it is highly available, workable, and hosted on GitHub for free. Most of the data I work with is publicly available. It is part of what I believe in and how I work on a regular basis. Making it available in a GitHub repo allows it to be forked or even consumed directly while offloading bandwidth and storage costs to GitHub. The GET layer for all my open data projects is static and dead simple to work with. Next, I'm working on a truly RESTful augmented layer providing the DELETE, as well as more advanced search solutions.
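One way to picture the static GET layer is as nothing more than a predictable URL per data file. The sketch below builds the raw.githubusercontent.com address for a file in a repository's _data folder and reads it as JSON. The owner, repository, and file names are placeholders, the default branch is an assumption, and `raw_data_url` and `fetch_data` are hypothetical helper names rather than part of any existing tooling.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def raw_data_url(owner, repo, name, branch="main", ext="json"):
    """Build the raw-content URL for _data/<name>.<ext> in a public repo."""
    path = f"_data/{name}.{ext}"
    return (
        "https://raw.githubusercontent.com/"
        f"{quote(owner)}/{quote(repo)}/{quote(branch)}/{quote(path)}"
    )


def fetch_data(owner, repo, name, branch="main"):
    """GET a JSON data file straight from GitHub (performs a network call)."""
    with urlopen(raw_data_url(owner, repo, name, branch), timeout=10) as response:
        return json.load(response)


# Placeholder coordinates; substitute a real public repository to fetch data.
example_url = raw_data_url("example-owner", "example-repo", "companies")
```

Because the layer is static, there is no server logic to maintain: a consumer only needs the repository coordinates and the file name.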
I am using the GitHub API for this augmented layer. I am just playing with different ways to proxy it and deliver the best search results possible. The DELETE layer for each GitHub repository data store in the _data folder is pretty straightforward. My goal is to offload as much of the payload to GitHub as possible, but then augment what it can't do when it comes to more advanced usage. I want each API index and data store to act as a forkable engine for a variety of stops along the API lifecycle, as well as throughout the delivery of the web, mobile, and device-based applications we are building on top of them.
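For a sense of what a write layer over a repository data store might look like, here is a minimal sketch that prepares requests against the GitHub REST API's repository contents endpoint, which accepts PUT to create or update a file and DELETE to remove one. Choosing that endpoint is an assumption about how such a layer could be built, not a description of the proxy discussed above. The helper names, repository coordinates, and token are hypothetical, and the requests are only constructed here, never sent.

```python
import base64
import json
from urllib.request import Request

API_ROOT = "https://api.github.com"


def contents_url(owner, repo, path):
    """URL of the repository contents endpoint for one file path."""
    return f"{API_ROOT}/repos/{owner}/{repo}/contents/{path}"


def _headers(token):
    return {
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
    }


def build_put_request(owner, repo, path, records, message, token, sha=None):
    """Prepare a create/update request; pass the current blob sha to update."""
    encoded = base64.b64encode(json.dumps(records, indent=2).encode("utf-8"))
    body = {"message": message, "content": encoded.decode("ascii")}
    if sha is not None:
        body["sha"] = sha
    return Request(
        contents_url(owner, repo, path),
        data=json.dumps(body).encode("utf-8"),
        headers=_headers(token),
        method="PUT",
    )


def build_delete_request(owner, repo, path, message, token, sha):
    """Prepare a delete request; the API requires the file's current sha."""
    body = {"message": message, "sha": sha}
    return Request(
        contents_url(owner, repo, path),
        data=json.dumps(body).encode("utf-8"),
        headers=_headers(token),
        method="DELETE",
    )
```

Passing either request to `urllib.request.urlopen` with a valid token would perform the change as a commit, so every write to the data store also lands in the repository history.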
Published at DZone with permission of Kin Lane, DZone MVB. See the original article here.