System.ServiceModel.Syndication or how to read RSS feeds in .NET
Today, a lot of websites provide their content in feeds that users can subscribe to. A feed is basically a simplified version of the site content: a text-and-media-only representation of the data published on the website. One of the most popular feed formats is RSS, which stands for Really Simple Syndication. It is XML-based and is used by the majority of websites with dynamically updated content, such as blogs or news sites.
Since RSS is XML-based and has a well-defined structure, it can be read as a regular XML document. However, the System.ServiceModel.Syndication namespace can make the feed reading process a bit easier.
So let’s look at a specific example. There is the DZone Link Feed, located here:
http://feeds.dzone.com/dzone/frontpage?format=xml
It’s an easy way to keep up with the upcoming interesting content, so I’d like to integrate this into my application. If I take a look at the source for the feed, I can see content similar to this:
<item>
<title>Google Web Toolkit Blog: Google Maps API for GWT 1.1.0 released</title>
<link>http://feeds.dzone.com/~r/dzone/frontpage/~3/wrVQN_7G8Hk/google_web_toolkit_blog_google_maps_api_for_gwt_1.html</link>
<description>We are pleased to announce the Google Maps API for the Google Web Toolkit 1.1.0 release. The Google Maps API library provides a way to access the Google Maps API from a GWT project without having to write additional JavaScript code.</description>
<category>frameworks</category>
<category>java</category>
<category>javascript</category>
<category>web services</category>
<pubDate>Thu, 03 Jun 2010 18:54:42 GMT</pubDate>
<guid isPermaLink="false">http://www.dzone.com/links/423703.html</guid>
<dc:creator>@thierry_lefort</dc:creator>
<dc:date>2010-06-03T18:54:42Z</dc:date>
<content:encoded><![CDATA[<a href='http://www.dzone.com/links/rss/google_web_toolkit_blog_google_maps_api_for_gwt_1.html'><img src='http://cdn.dzone.com/images/thumbs/120x90/423703.jpg' style='width:120;height:90;float:left;vertical-align:top;border:1px solid #ccc;' /></a><p style='margin-left: 130px;'>We are pleased to announce the Google Maps API for the Google Web Toolkit 1.1.0 release. The Google Maps API library provides a way to access the Google Maps API from a GWT project without having to write additional JavaScript code.<br/><br/><a href='http://www.dzone.com/links/rss/google_web_toolkit_blog_google_maps_api_for_gwt_1.html'><img src='http://www.dzone.com/links/voteCountImage?linkId=423703' border='0'/></a></p><img src="http://feeds.feedburner.com/~r/dzone/frontpage/~4/wrVQN_7G8Hk" height="1" width="1"/>]]></content:encoded>
<dz:linkId>423703</dz:linkId>
<dz:submitDate>2010-06-02T10:12:27Z</dz:submitDate>
<dz:promoteDate>2010-06-03T18:54:42Z</dz:promoteDate>
<dz:voteUpCount>6</dz:voteUpCount>
<dz:voteDownCount>0</dz:voteDownCount>
<dz:clickCount>90</dz:clickCount>
<dz:commentCount>0</dz:commentCount>
<dz:thumbnail>http://www.dzone.com/links/images/thumbs/120x90/423703.jpg</dz:thumbnail>
<dz:submitter>
<dz:username>Thierry.Lefort</dz:username>
<dz:userimage>http://www.dzone.com/links/images/avatars/252611.gif</dz:userimage>
</dz:submitter>
<feedburner:origLink>http://www.dzone.com/links/rss/google_web_toolkit_blog_google_maps_api_for_gwt_1.html</feedburner:origLink>
</item>
As you can see here, an item is basically a feed entry that contains information about one content entity. Note that there are some site-specific tags that I am not covering here, but that are still readable via XmlDocument.
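As a rough illustration of the XmlDocument route for the site-specific tags, here is a minimal sketch. It works on a trimmed-down inline copy of the item rather than the live feed, and the dz namespace URI in it is a placeholder I made up, since the real one is not shown in the sample above. Matching on local-name() sidesteps the namespace question entirely.

```csharp
using System;
using System.Xml;

class ExtensionTagsSketch
{
    static void Main()
    {
        // Trimmed-down stand-in for the feed. The dz namespace URI is a
        // placeholder (assumption), not the one the real feed declares.
        const string xml =
            "<rss version=\"2.0\" xmlns:dz=\"http://example.com/dz-placeholder\">" +
            "<channel><title>Sample</title><item>" +
            "<title>Google Maps API for GWT 1.1.0 released</title>" +
            "<dz:voteUpCount>6</dz:voteUpCount>" +
            "<dz:clickCount>90</dz:clickCount>" +
            "</item></channel></rss>";

        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        foreach (XmlNode item in doc.SelectNodes("/rss/channel/item"))
        {
            // local-name() matches the element whatever namespace it lives in.
            XmlNode votes = item.SelectSingleNode("*[local-name()='voteUpCount']");
            XmlNode clicks = item.SelectSingleNode("*[local-name()='clickCount']");

            Console.WriteLine("{0}: {1} votes, {2} clicks",
                item["title"].InnerText,
                votes != null ? votes.InnerText : "n/a",
                clicks != null ? clicks.InnerText : "n/a");
        }
    }
}
```

If the namespace URI is known, an XmlNamespaceManager with properly prefixed XPath expressions would be the stricter choice, because local-name() also matches same-named elements from other namespaces.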
An important note: without third-party components, the only RSS version .NET supports is RSS 2.0. (Atom 1.0 is supported as well, but I am not covering it here.)
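To get started, add a reference to the assembly that contains the System.ServiceModel.Syndication namespace: System.ServiceModel.dll in .NET 4, or System.ServiceModel.Web.dll in .NET 3.5. The snippets that follow also use XmlReader (System.Xml), Debug (System.Diagnostics) and XElement (System.Xml.Linq), so those namespaces need to be imported as well. Now, at the top of the source file, add the statement below: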
using System.ServiceModel.Syndication;
Now you’re all set. Let’s try to read the feed above. To do this, I will need a SyndicationFeed instance and an XmlReader. Here is what the code looks like:
SyndicationFeed feed = SyndicationFeed.Load(XmlReader.Create("http://feeds.dzone.com/dzone/frontpage"));
This will read the actual feed. Since the feed is composed of multiple items, all of them are stored in the Items collection. You can go through it like this:
foreach (SyndicationItem item in feed.Items)
{
    Debug.Print(item.Title.Text);
}
The code above prints the title of each item in the feed. A question that might arise is why I have to use the Text property when Title should already be a string. That assumption is wrong: Title is in fact of type TextSyndicationContent, which means it can contain plain text, HTML or XHTML. Therefore, calling ToString() on it directly will not give you the title text.
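To see the difference between Text and ToString() without hitting the network, here is a small self-contained sketch. The inline XML is made-up sample data standing in for the downloaded feed, and I use Console.WriteLine instead of Debug.Print so the output shows up in a plain console application. It also wraps the XmlReader in a using block, which the one-liner above skips.

```csharp
using System;
using System.IO;
using System.ServiceModel.Syndication;
using System.Xml;

class TitleSketch
{
    static void Main()
    {
        // Inline stand-in for the downloaded feed (made-up sample data).
        const string xml =
            "<rss version=\"2.0\"><channel>" +
            "<title>Sample feed</title>" +
            "<link>http://example.com/</link>" +
            "<description>Sample</description>" +
            "<item><title>First entry</title>" +
            "<description>Short summary</description></item>" +
            "</channel></rss>";

        SyndicationFeed feed;
        // Dispose the reader once the feed has been loaded into memory.
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            feed = SyndicationFeed.Load(reader);
        }

        foreach (SyndicationItem item in feed.Items)
        {
            TextSyndicationContent title = item.Title;
            Console.WriteLine(title.Text);       // the title string itself
            Console.WriteLine(title.Type);       // kind of content: text, html or xhtml
            Console.WriteLine(title.ToString()); // as far as I can tell, just the type name
        }
    }
}
```

The same pattern should work with a URL passed to XmlReader.Create, as in the original one-liner.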
For each of the items I can get the summary – a short representation of the published content:
foreach (SyndicationItem item in feed.Items)
{
    Debug.Print(item.Summary.Text);
}
Please remember that item.Summary is not the same as item.Content. As you can see in the feed sample I provided, the content is wrapped in CDATA, meaning that it can carry HTML to the receiving end.
If I try to use item.Content here, I will get an “Object reference not set to an instance of an object.” exception, because the feed stores the content in a content:encoded element instead of content, so item.Content is null.
To read the content in this case, I am going to use this:
foreach (SyndicationItem item in feed.Items)
{
    foreach (SyndicationElementExtension ext in item.ElementExtensions)
    {
        // content:encoded is exposed as an extension element; match it by its local name
        if (ext.OuterName == "encoded")
            Debug.Print(ext.GetObject<XElement>().Value);
    }
}
This avoids the exception mentioned above and reads the string representation of the item content.
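A slightly more defensive variation is sketched below. It matches content:encoded by both its local name and the namespace of the RSS content module, and it guards against items that have no summary or no encoded content at all. GetEncodedContent is a helper name I made up for this example, and the inline XML is sample data, not the real DZone feed.

```csharp
using System;
using System.IO;
using System.ServiceModel.Syndication;
using System.Xml;
using System.Xml.Linq;

class EncodedContentSketch
{
    // Namespace URI of the RSS "content" module, which defines content:encoded.
    const string ContentNs = "http://purl.org/rss/1.0/modules/content/";

    // Hypothetical helper: returns the content:encoded text, or null if absent.
    static string GetEncodedContent(SyndicationItem item)
    {
        foreach (SyndicationElementExtension ext in item.ElementExtensions)
        {
            if (ext.OuterName == "encoded" && ext.OuterNamespace == ContentNs)
                return ext.GetObject<XElement>().Value;
        }
        return null;
    }

    static void Main()
    {
        // Made-up sample: the first item has content:encoded, the second does not.
        const string xml =
            "<rss version=\"2.0\" xmlns:content=\"" + ContentNs + "\"><channel>" +
            "<title>Sample feed</title>" +
            "<link>http://example.com/</link>" +
            "<description>Sample</description>" +
            "<item><title>With content</title>" +
            "<description>Short summary</description>" +
            "<content:encoded><![CDATA[<p>Full <b>HTML</b> body</p>]]></content:encoded>" +
            "</item>" +
            "<item><title>Without content</title></item>" +
            "</channel></rss>";

        SyndicationFeed feed;
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            feed = SyndicationFeed.Load(reader);
        }

        foreach (SyndicationItem item in feed.Items)
        {
            // Summary is null when the item has no description element.
            string summary = item.Summary != null ? item.Summary.Text : "(no summary)";
            string content = GetEncodedContent(item) ?? "(no content:encoded)";

            Console.WriteLine(item.Title.Text);
            Console.WriteLine("  summary: " + summary);
            Console.WriteLine("  content: " + content);
        }
    }
}
```

Checking the namespace as well as the name keeps the lookup from picking up an unrelated extension element that also happens to be called encoded.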