
Aggregating Multiple GitHub Account RSS Feeds Into Single JSON API Feed

Using GitHub's RSS feature is a great way to stay up-to-date on what the developer community is up to. Learn how to create an API to keep track of all these feeds!


GitHub is the number one signal in my API world. The activity that occurs via GitHub is more important than anything I find across Twitter, Facebook, LinkedIn, and other social channels. Commits to repositories, and the other social activity that occurs around coding projects, tell me far more about what a company is up to than the deliberate social media signals blasted out via other channels. I'm always working to dial in my monitoring of GitHub using the GitHub API, but also via the RSS feeds present on the public side of the platform.

I feel RSS is often overlooked as an API data source, but I find that RSS is not only alive and well in 2018, it is actively used on many platforms. The problem with RSS for me is that the XML isn't always easy to work with in my JavaScript-enabled applications, and I also tend to want to aggregate, filter, and translate RSS feeds into more meaningful JSON. To help me accomplish this for GitHub, I crafted a simple PHP RSS aggregator and converter script which I can run in a variety of situations. I published the basic script to GitHub as a Gist, for easy reference.

<?php
// Sort feed entries by published date, newest first.
function date_compare($a, $b)
{
    $t1 = strtotime($a['published']);
    $t2 = strtotime($b['published']);
    return $t2 - $t1;
}

// Add any users I want to follow.
$user_array = array('kinlane', 'apievangelist', 'streamdataio');
$feed_array = array();

// Loop through my users.
foreach ($user_array as $user) {

    // Pull the Atom feed for each user.
    $rss_feed = 'https://github.com/' . $user . '.atom';
    $github_rss = file_get_contents($rss_feed);
    if ($github_rss === false) {
        continue; // skip users whose feed could not be fetched
    }
    $github_object = simplexml_load_string($github_rss);
    if ($github_object === false) {
        continue; // skip feeds that fail to parse
    }

    foreach ($github_object->entry as $entry) {

        // Establish local values, in case I want to do something else.
        $id = $entry->id->__toString();
        $published = $entry->published->__toString();
        $link = $entry->link['href']->__toString();
        $title = $entry->title->__toString();
        $author_name = $entry->author->name->__toString();
        $author_url = $entry->author->uri->__toString();

        // Create an array for the entry.
        $item = array();
        $item['id'] = $id;
        $item['published'] = $published;
        $item['title'] = $title;
        $item['link'] = $link;
        $item['author_name'] = $author_name;
        $item['author_url'] = $author_url;

        // Add the individual entry to the aggregate feed.
        array_push($feed_array, $item);
    }
}

// Sort the collection of feeds by date.
usort($feed_array, 'date_compare');

// Publish as JSON.
$feed_json = json_encode($feed_array);
echo $feed_json;
?>

The simple PHP script just takes an array of GitHub users, loops through them, pulls their RSS feeds, aggregates them into a single array, sorts by date, and outputs as JSON. It is a pretty crude JSON API, but it provides me with what I need to be able to use these RSS feeds in a variety of other applications. I'm going to be mining the feeds for a variety of signals, including repo and user information, which I can then use within other applications. The best part is that this type of data mining doesn't require a GitHub API key and relies only on publicly available feeds, allowing me to scale up much further than I could with the GitHub API alone.
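One way to mine the aggregated feed for signals is to look at the entry ids, which in GitHub's Atom feeds embed the event type (for example, `PushEvent`). The sketch below tallies event types from the aggregated JSON; the sample JSON is hypothetical stand-in data for the script's output, not a real response.

```php
<?php
// Stand-in for the JSON produced by the aggregator script above.
$feed_json = '[
  {"id":"tag:github.com,2008:PushEvent/111","title":"kinlane pushed to master in kinlane/api"},
  {"id":"tag:github.com,2008:WatchEvent/222","title":"apievangelist starred streamdataio/docs"},
  {"id":"tag:github.com,2008:PushEvent/333","title":"streamdataio pushed to gh-pages in streamdataio/site"}
]';
$entries = json_decode($feed_json, true);

// Tally events by the type embedded in each entry id.
$event_counts = array();
foreach ($entries as $entry) {
    if (preg_match('/2008:([A-Za-z]+)\//', $entry['id'], $matches)) {
        $type = $matches[1];
        $event_counts[$type] = ($event_counts[$type] ?? 0) + 1;
    }
}
echo json_encode($event_counts);
?>
```

Run against a real aggregate feed, this gives a quick breakdown of how much of the activity is commits versus stars, forks, and other events.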

Next, I have a couple of implementations in mind. I'm going to be creating a GitHub user leaderboard, where I stream the updates to a dashboard using Streamdata.io. Before I do that, I will have to aggregate users and repos, incrementing each commit made and publishing as a separate JSON feed. I want to be able to see the raw updates, but also the most changed repositories and the most active users across different segments of the API space. Streamdata.io allows me to take these JSON feeds and stream them to the dashboard using Server-Sent Events (SSE), applying each update using JSON Patch, making for a pretty efficient way to put GitHub to work as part of my monitoring of activity across the API space.
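The leaderboard aggregation described above can be sketched in a few lines: tally entries per author from the aggregated feed array, sort most-active first, and publish as a separate JSON feed. The sample `$feed_array` here is a hypothetical stand-in for the aggregator script's output.

```php
<?php
// Stand-in for the aggregated feed array built by the script above.
$feed_array = array(
    array('author_name' => 'kinlane', 'title' => 'kinlane pushed to master in kinlane/api'),
    array('author_name' => 'kinlane', 'title' => 'kinlane pushed to gh-pages in kinlane/site'),
    array('author_name' => 'streamdataio', 'title' => 'streamdataio pushed to master in streamdataio/docs'),
);

// Count entries per author to build the leaderboard.
$leaderboard = array();
foreach ($feed_array as $item) {
    $user = $item['author_name'];
    $leaderboard[$user] = ($leaderboard[$user] ?? 0) + 1;
}

// Sort by activity, most active users first (arsort preserves keys).
arsort($leaderboard);

// Publish as a separate JSON feed for the dashboard.
echo json_encode($leaderboard);
?>
```

The same pattern works for a most-changed-repositories feed by keying the tally on the repository name instead of the author.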



Published at DZone with permission.
