How to Merge RSS Feeds Using PHP & Zend Framework
Zend Framework is becoming a very comprehensive set of widely needed components for PHP development. While other frameworks offer similar components, one of Zend Framework's greatest strengths is that you can use its components standalone, not only as part of the MVC structure. In this tutorial I will show how you can easily use its Zend_Feed component to merge feeds.
Most of the basic actions, like importing an RSS feed or creating one, are covered in the Zend Framework manual. In this post I will focus on the more advanced topics, like sorting and merging RSS feeds, but I will also go briefly over the basics.
So let's begin.
First of all we need to load the RSS feed, which is quite easy:
function loadFeed ($url) {
    try {
        $feed = Zend_Feed::import($url);
    } catch (Zend_Feed_Exception $e) {
        // feed import failed
        return null;
    }
    return $feed;
}
Now that we have a feed we can read its properties. For example, let's print out the title:
$feed = loadFeed ('http://www.arikfr.com/blog/feed/');
echo $feed->title();
Or iterate over all of the feed entries and print their titles:
foreach ($feed as $entry) {
    echo $entry->title();
}
(Zend_Feed_Abstract, which both Zend_Feed_Rss and Zend_Feed_Atom extend, implements the Iterator interface, so we can use it in a foreach loop.)
Now that we have covered the basics, let's start constructing a new feed. You can make an RSS or Atom feed using Zend_Feed; we will create an RSS feed. The feed is created from an array that holds the new feed's properties. Some fields are mandatory and some are optional. In this example I will cover all the mandatory ones (you can find the complete structure here):
$merged_feed = array (
    'title' => 'ArikFr.com merged Feed',
    'link' => 'http://www.arikfr.com/merged_feed.php',
    'charset' => 'UTF-8',
    'entries' => array (),
);
I'm leaving the entries field empty; we will fill it later with entries from the feeds we want to merge. Now that we have the basic array ready, we can create a Zend_Feed_Rss object from it:
$rssFeedFromArray = Zend_Feed::importArray($merged_feed, 'rss');
And we can output it to the browser simply by calling the send method of the Zend_Feed_Rss object:
$rssFeedFromArray->send();
Now all the browser will get is an empty RSS feed:
<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" version="2.0">
<channel>
<title><![CDATA[ArikFr.com merged Feed]]></title>
<link>http://www.arikfr.com/merged_feed.php</link>
<description></description>
<pubDate>Mon, 25 Feb 2008 20:55:37 +0000</pubDate>
<generator>Zend_Feed</generator>
<docs>http://blogs.law.harvard.edu/tech/rss</docs>
</channel>
</rss>
So far everything we have done was quite simple and very basic. Now let's start importing the entries from the feeds we want to merge:
function getEntriesAsArray ($feed) {
    $entries = array();
    foreach ($feed as $entry) {
        $entries[] = array (
            'title' => $entry->title(),
            'link' => $entry->link(),
            'guid' => $entry->guid(),
            'lastUpdate' => strtotime($entry->pubDate()),
            'description' => $entry->description(),
            'pubDate' => $entry->pubDate(),
        );
    }
    return $entries;
}
All this function does is iterate over the entries of the given feed and return them as an array. I don't really like this approach and it has some flaws: not all feeds have the same fields, and sometimes it would be better just to copy the original node element to the new feed. As far as I know, this can't be done (easily, at least) with Zend_Feed (I didn't find a way, so if you know otherwise, please drop me a line). Anyhow, the code sample above creates a feed entry with the most basic (and mandatory) fields.
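One possible workaround for the node-copying limitation is to step outside Zend_Feed for the merge and use PHP's bundled DOM extension. This is a swapped-in technique rather than part of the approach in this post, and only a sketch: the helper name copyItems and the sample XML strings are invented for illustration, and it assumes every input is an RSS 2.0 document with a single channel element.

```php
<?php
/**
 * Hypothetical helper (not a Zend_Feed API): copies every <item> element
 * from an RSS string into the <channel> of the target document.
 * Returns the number of items copied.
 */
function copyItems(DOMDocument $target, $sourceXml)
{
    $source = new DOMDocument();
    // Suppress parser warnings; a malformed feed simply contributes nothing.
    if (!@$source->loadXML($sourceXml)) {
        return 0;
    }

    $channel = $target->getElementsByTagName('channel')->item(0);
    if ($channel === null) {
        return 0;
    }

    $copied = 0;
    foreach ($source->getElementsByTagName('item') as $item) {
        // importNode(..., true) makes a deep copy owned by the target document,
        // so child elements that a fixed field list would drop are preserved.
        $channel->appendChild($target->importNode($item, true));
        $copied++;
    }
    return $copied;
}

// Sample data, assumed for illustration only.
$merged = new DOMDocument('1.0', 'UTF-8');
$merged->loadXML('<rss version="2.0"><channel><title>Merged</title></channel></rss>');

$feedA = '<rss version="2.0"><channel><item><title>A1</title><category>php</category></item></channel></rss>';
$feedB = '<rss version="2.0"><channel><item><title>B1</title><guid>b1</guid></item><item><title>B2</title></item></channel></rss>';

$total = copyItems($merged, $feedA) + copyItems($merged, $feedB);
echo $total, " items copied\n";
```

Items copied this way keep fields such as category, but they arrive in source order, so sorting them by date would need extra work on the DOM nodes.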
Also notice that I convert the publish date of each RSS item to a timestamp using the strtotime function. This function takes a date or time value in a variety of formats and converts it to a Unix timestamp - very useful.
Now if we just use getEntriesAsArray as is, meaning:
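As a quick illustration of that conversion, here is a minimal sketch. The helper name toTimestamp is hypothetical, and the sample date is the pubDate from the empty feed shown earlier. strtotime returns false when it cannot parse its input, so the sketch falls back to 0; that default is an assumption about how you would want undated entries to sort.

```php
<?php
// Hypothetical helper: converts a feed date string to a Unix timestamp.
// Falls back to 0 (an assumed default) when strtotime() cannot parse the value.
function toTimestamp($dateString)
{
    $timestamp = strtotime($dateString);
    return ($timestamp === false) ? 0 : $timestamp;
}

// RFC 822 style date, as used by RSS pubDate elements.
$rssDate = toTimestamp('Mon, 25 Feb 2008 20:55:37 +0000');
// The same instant in ISO 8601 form, as commonly found in Atom feeds.
$atomDate = toTimestamp('2008-02-25T20:55:37+00:00');
// An empty string stands in for an entry without a usable date.
$missing = toTimestamp('');

var_dump($rssDate === $atomDate);            // bool(true)
echo gmdate('Y-m-d H:i:s', $rssDate), "\n";  // 2008-02-25 20:55:37
var_dump($missing);                          // int(0)
```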
$merged_feed['entries'] = array_merge (
    getEntriesAsArray ($feed1),
    getEntriesAsArray ($feed2));
We will have all the entries from both blogs in the new feed, but they won't be sorted. PHP's basic array sort functions can easily sort numbers or strings, but to sort something more complex we need the usort function. If you're unfamiliar with usort, it's an array sorting function that takes a custom comparison function (see the manual page for more info). So let's write our comparison function:
function cmpEntries ($a, $b) {
    $a_time = $a['lastUpdate'];
    $b_time = $b['lastUpdate'];
    if ($a_time == $b_time) {
        return 0;
    }
    return ($a_time > $b_time) ? -1 : 1;
}
This function sorts the entries by timestamp in descending order, newest first, as feed entries should be. Now that we have the comparison function, sorting the RSS entries is quite easy:
usort ($merged_feed['entries'], 'cmpEntries');
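To try the merge and the sort together without fetching real feeds, here is a self-contained sketch. The entry arrays are invented sample data standing in for the output of getEntriesAsArray, and the comparison function is given the hypothetical name cmpEntriesDesc so it does not clash with the one defined above.

```php
<?php
// Same ordering rule as the comparison function above: newest entries first.
function cmpEntriesDesc($a, $b)
{
    if ($a['lastUpdate'] == $b['lastUpdate']) {
        return 0;
    }
    return ($a['lastUpdate'] > $b['lastUpdate']) ? -1 : 1;
}

// Invented sample entries standing in for getEntriesAsArray() results.
$feed1Entries = array(
    array('title' => 'Feed 1, newest', 'lastUpdate' => strtotime('2008-02-25 20:00:00 UTC')),
    array('title' => 'Feed 1, oldest', 'lastUpdate' => strtotime('2008-02-01 09:00:00 UTC')),
);
$feed2Entries = array(
    array('title' => 'Feed 2, middle', 'lastUpdate' => strtotime('2008-02-10 12:00:00 UTC')),
);

// array_merge() appends the second list to the first and renumbers the keys.
$entries = array_merge($feed1Entries, $feed2Entries);
usort($entries, 'cmpEntriesDesc');

foreach ($entries as $entry) {
    echo gmdate('Y-m-d', $entry['lastUpdate']), '  ', $entry['title'], "\n";
}
```

On PHP 7 or later, the body of the comparison function could be shortened to `return $b['lastUpdate'] <=> $a['lastUpdate'];`, which gives the same descending order.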
And now we're practically finished. The image below shows how the merged feed looks before the sort (left) and after the sort (right):
(I'm sorry that the timestamps are in Hebrew - that's because of my computer's locale settings. Just look at the times and dates.)
There are some things we can improve in this example, like limiting the number of entries we take from each feed, adding the original blog name to each post title, and maybe caching the entries from the feeds (using Zend_Cache) so we don't have to read them each time an RSS reader pings the page. If anyone is interested, I will cover these issues in future posts (just leave a comment).
This came out longer than I imagined; I guess that's because of all the source code examples. Any comments, ideas and better practices are most welcome!
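Two of the improvements mentioned above, limiting the entries taken from each source and adding the blog name to each title, need nothing beyond plain array functions. The following is one possible sketch, not a tested part of the original script: the helper name prepareEntries, its parameters, and the sample data are all assumptions.

```php
<?php
// Hypothetical helper: keeps at most $limit entries from one feed and
// prefixes each title with the name of the blog it came from.
function prepareEntries(array $entries, $blogName, $limit)
{
    $prepared = array();
    // array_slice() keeps the first $limit entries in their existing order,
    // which assumes the source feed already lists its newest entries first.
    foreach (array_slice($entries, 0, $limit) as $entry) {
        $entry['title'] = '[' . $blogName . '] ' . $entry['title'];
        $prepared[] = $entry;
    }
    return $prepared;
}

// Invented sample entries in the same shape the merge step works with.
$sample = array(
    array('title' => 'Third post',  'lastUpdate' => 300),
    array('title' => 'Second post', 'lastUpdate' => 200),
    array('title' => 'First post',  'lastUpdate' => 100),
);

$result = prepareEntries($sample, 'ArikFr.com', 2);
foreach ($result as $entry) {
    echo $entry['title'], "\n";
}
```

Calling a helper like this on each feed's entries before array_merge would keep the merged feed short while still showing where every post came from.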
Published at DZone with permission of Arik Fraimovich. See the original article here.