RSS Toolkit
Turn RSS Feeds into Standard Data Sources
October 30, 2009
ControlFreak
LANGUAGES: VB.NET | C#
ASP.NET VERSIONS: 2.0+
By Steve C. Orr
The World Wide Web is overflowing with potentially useful data. In addition to the HTML output with which we are all familiar, many Web sites expose content and data in other consumable formats as well. In the following paragraphs you'll learn how to use the components contained within the free, open source RSS Toolkit to manipulate RSS data feeds.
What Is RSS?
Really Simple Syndication (RSS) is an XML-based format that provides a simple way to publish new content notifications along with summaries of that content. There's a vast supply of RSS feeds available on the Internet. When a user visits a Web site and sees a symbol similar to Figure 1, it's an indication that an RSS feed is available to which they may want to subscribe. Figure 2 shows a sample RSS feed.
Figure 1: This little symbol indicates to users that the Web site they are viewing has an RSS feed available.
    http://SomeWeb.net | Computer Stuff | en-US
    Manage Your Finances | Software | http://someweb.net/cash.aspx
    Manage Your Documents | Software | http://someweb.net/office.aspx
    Laptops and Desktops | Hardware | http://someweb.net/hardware.aspx
    Input Devices | Hardware | http://someweb.net/keyboards.aspx
Figure 2: An example RSS feed.
These days many companies, software applications, and Web sites also use RSS feeds from around the Web to provide topical content to their users. In fact, several of today's most visited Web sites rely almost entirely on the content of others. For example, successful newcomers like Feedburner, Digg, and Reddit continue to prove that Google doesn't have a monopoly on useful perspectives of the Web we all share.
RSS has been the de facto standard content distribution format since the early days of XML. Its star status has recently been propelled to a whole new level with the emergence of Service Oriented Architectures (SOA) and Cloud Computing. The latest Web service APIs on this front commonly support the option of having resulting data returned in RSS format. Such recent developments are pushing the RSS format beyond its original syndication purposes and into a more general purpose data fulfillment role.
Consuming RSS Feeds
Now that RSS feeds are becoming a common data source, questions begin to surface about the best ways to work with them programmatically.
Because RSS is essentially just text that has been formatted in very specific ways, basically any brute force parsing technique could be used to extract needed bits of data. However, this is rarely an optimal approach.
RSS is an XML-based format. XML has been ubiquitous for so long now that there are almost too many options for processing XML data. Even the most obscure platforms and programming languages are bound to have a variety of time-saving XML libraries at their disposal. The .NET Framework contains most such functionality inside the System.Xml and System.Data namespaces.
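As a rough illustration of the generic XML approach described above, the following C# sketch uses the .NET Framework's XmlDocument class to pull item data out of a feed. The feed text, the RssSketch class, and its ParseItems and Text helpers are hypothetical names invented for this example. The element names (channel, item, title, category, link) are assumed from the RSS 2.0 specification rather than taken from Figure 2.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class RssSketch
{
    // Hypothetical sample feed; element names assume the RSS 2.0 layout.
    public const string SampleFeed =
        "<?xml version=\"1.0\"?>" +
        "<rss version=\"2.0\"><channel>" +
        "<title>Computer Stuff</title>" +
        "<link>http://SomeWeb.net</link>" +
        "<description>Sample feed for this sketch</description>" +
        "<language>en-US</language>" +
        "<item><title>Manage Your Finances</title>" +
        "<category>Software</category>" +
        "<link>http://someweb.net/cash.aspx</link></item>" +
        "<item><title>Input Devices</title>" +
        "<category>Hardware</category>" +
        "<link>http://someweb.net/keyboards.aspx</link></item>" +
        "</channel></rss>";

    // Returns one "title | category | link" string per <item> element.
    public static List<string> ParseItems(string rssXml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(rssXml);

        List<string> rows = new List<string>();
        foreach (XmlNode item in doc.SelectNodes("/rss/channel/item"))
        {
            rows.Add(Text(item, "title") + " | " +
                     Text(item, "category") + " | " +
                     Text(item, "link"));
        }
        return rows;
    }

    // Reads a child element's text, or an empty string if it is missing.
    private static string Text(XmlNode parent, string childName)
    {
        XmlNode child = parent.SelectSingleNode(childName);
        return child == null ? string.Empty : child.InnerText;
    }

    public static void Main()
    {
        foreach (string row in ParseItems(SampleFeed))
        {
            Console.WriteLine(row);
        }
    }
}
```

Even in this small case, the code has to know the RSS element names and guard against missing elements by hand, which is the kind of repetitive work a purpose-built RSS component can absorb.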
While nearly any generic XML library could prove helpful for processing RSS data, in most cases custom code must still be added to handle the more specific structure that RSS layers on top of XML.
In last month's column I demonstrated that XSLT can be used for transforming raw RSS data into other formats (such as HTML) that are more attractive to end users (see XML Transformations). However, in most cases, XSLT is still a fairly laborious undertaking when the goal is simply to work with RSS data as if it were coming from any other standard data source (like a database).
ASP.NET includes several controls that provide an excellent model for working with data originating from disparate sources. For example, the SqlDataSource, ObjectDataSource, and SiteMapDataSource controls all process varying data structures into more standard ones. This useful encapsulation allows developers to consistently employ common data processing techniques (like data binding, DataSets, etc.), regardless of the original source of the data.
This analysis leads to the conclusion that a carefully designed RssDataSource control would be the optimal solution for ASP.NET developers working with RSS data. Unfortunately, Microsoft has not provided such a control. Fortunately, an enterprising developer from their ASP.NET development team took the initiative to develop one on his own.
The RSS Toolkit
Dmitry Robsman is the creator of the free RSS Toolkit, which holds the RssDataSource Web control as its centerpiece. He generously donated the toolkit's code to the open source community via http://CodePlex.com. Its freely available C# source code is compatible with ASP.NET 2.0 and above. Several code samples are included to help you get started. Because this tool was not officially provided by Microsoft, it also is not officially supported by Microsoft.
After downloading the RSS Toolkit from http://www.codeplex.com/ASPNETRSSToolkit, you need only add the included RSSToolkit.dll to your Visual Studio toolbox.
This can be accomplished via the Choose Items option on the toolbox's right-click menu, which enables you to then browse to the DLL (which can be found in the toolkit project's bin folder).
While the DLL can optionally be registered in the Global Assembly Cache (GAC), it can just as easily be placed in a project's bin folder, which makes XCopy deployment a snap. The RSS Toolkit was designed to work in medium trust scenarios, so using it on a shared host won't typically be an issue. Such hosts need only support remote outbound HTTP requests, which is a relatively common find.
The RssDataSource Control
As you'd expect, the RssDataSource control has a familiar design that is consistent with the data source controls included with ASP.NET. A basic ASPX declaration is all that's needed to specify the source location of the RSS data feed (via its Url property). The only other property is MaxItems, which was recently added to provide an ability to limit the number of items retrieved from the specified feed:
    url="http://SteveOrr.net/rss.aspx" MaxItems="3">
Standard data binding techniques can then be implemented to bind controls to this data source, such as the purely declarative example shown here that binds a GridView control to the RssDataSource control declared above:
    DataSourceID="RssDataSource1">
With just a splash of color applied, the two simple declarations above result in the output displayed in Figure 3.
Figure 3: Two simple ASPX declarations are all it takes to display potentially useful data from a remote RSS feed.
All the familiar Visual Studio data source wizards work just as you'd expect with this new data source control. Available data columns are automatically fetched from the remote data feed at design time. The resulting data fields can all be adjusted in any way imaginable, just as if they were coming from any other standard data source.
Figure 4: The RSS feed's available data fields are automatically retrieved and fully editable.
Programmatic Parsing
In addition to the declarative techniques mentioned above, RSS feeds also can be retrieved and bound programmatically. The following C# code snippet shows how the RSS Toolkit's RssDocument class can be used to accomplish this:
    string sLoc = "http://SteveOrr.net/rss.aspx";
    // Download and parse the remote feed.
    RssToolkit.Rss.RssDocument rss =
        RssToolkit.Rss.RssDocument.Load(new System.Uri(sLoc));
    // Show the channel's image, then bind the feed items to a Repeater.
    Image1.ImageUrl = rss.Channel.Image.Url;
    Repeater1.DataSource = rss.SelectItems();
    Repeater1.DataBind();
This technique can be especially useful for non-Web applications, such as a command line or Windows Forms application.
RSS feeds can be retrieved via URL (as shown) or loaded directly from an XmlReader, or even a string. Once a data feed has been loaded into the RssDocument object, it can be converted directly to a DataSet using the ToDataSet method. It also can be exported to several supported XML formats using its ToXml method.
Fast Cache
Because retrieving and processing remote XML files can be rather processor intensive, it's a good thing that an efficient caching mechanism has been built in to the control. Instead of continually fetching the remote RSS feed upon each page request, this data can instead be retrieved from a local cache. This local cache is kept in memory and also is persisted to disk (so it can be used even after the process restarts).
Related configuration values can be adjusted in the appSettings section of the web.config file. In the example configuration, the time-to-live value is set to 30 minutes. This default is used in cases where the RSS data does not explicitly provide such a value. If no value is specified, the default is 1 minute.
The second value (specified by the rssTempDir key) can be used to configure the location where the local disk cache should be stored. Cache files saved in this location can be identified by their .feed file extension.
In most cases, this configuration value is not strictly necessary because the RssDataSource control contains logic that will automatically find a usable temp directory on the server.
The RssHyperlink Control
Aside from the RssDataSource control, the RssHyperlink is the only other Web control included with the RSS Toolkit. A hyperlink is displayed for each RssHyperlink control placed on a page, enabling users to view the RSS feed associated with that control.
Additionally, the existence of one or more RssHyperlink controls on a page also serves to inform modern Web browsers that RSS feeds are available for the current Web site. The browser's RSS symbol shown in Figure 1 then springs to life, allowing users to easily subscribe to the feed(s) in a consistent and familiar way. The RssHyperlink control implements this feature by automatically placing a link tag in the header section of the page's HTML. Such standardized RSS link tags look similar to this:
    <link rel="alternate" type="application/rss+xml" title="Some Web Site's RSS Feed" href="http://someweb.net/rss.xml" />
The RssHyperlink control's optional ChannelName property can be used to specify a particular channel within the RSS feed, if applicable:
    ID="RssHyperLink1" runat="server" ChannelName="FAQ" IncludeUserName="True" NavigateUrl="MyRss.xml"> Click Here For RSS
The optional IncludeUserName property can be used to pass the user's credentials to the RSS feed. The default value of the Boolean IncludeUserName property is false. When this property is set to true, any forms authentication credentials associated with the user will be passed to the RSS feed, allowing a custom view of the data based on the user's profile and/or preferences. More specifically, an encrypted, Base64 encoded version of the user's FormsAuthenticationTicket will be passed to the feed via querystring. Programmatically generated feeds can then use the RSS Toolkit's RssHttpHandlerBase class to extract the credentials and filter the data appropriately.
Advanced Features
The RSS Toolkit provides a variety of other noteworthy features that cannot be covered here in much detail because of space limitations.
For example, a build provider is included that can automatically create a strongly typed object model of any RSS feed. Of course, the main benefits of strongly typed objects include design-time IntelliSense, improved performance, and the ability to catch many common errors at compile time instead of run time.
While consuming RSS feeds is certainly a valuable feature, publishing your own feeds can be just as valuable. With this in mind, the object model included within the RSS Toolkit can also assist in the programmatic creation of custom RSS feeds.
The latest (2.0) version of the RssDataSource control also supports a few variations of the RSS syndication format. These variations (Atom, RDF, and OPML) are automatically detected and encapsulated by the control. The RssDocument class described earlier can be used to convert between these formats.
Other new 2.0 features include automatic conversion from relative URLs to absolute URLs. Images and RSS extensions are also now fully supported.
Conclusion
RSS has long been the standard when it comes to syndication formats on the Web. As Service Oriented Architectures continue to exert their growing dominance in the software development world, RSS is becoming ever more prominent. The ubiquity of RSS makes it a worthy investment of limited development resources. By mastering this standard data exchange format, it will be easier than ever to swap content and data with Web sites and companies scattered across the globe.
The free, open source RSS Toolkit makes this easier than ever with the code and controls contained within. The RssDataSource control makes it easy to bind standard ASP.NET controls to RSS feeds in familiar and intuitive ways. Its caching capabilities ensure that such functionality needn't become a performance bottleneck.
The RSS Toolkit and its built-in RssHyperlink control also help to ease pains associated with publishing custom RSS feeds.
Now that you've had a thorough initiation into this realm, I encourage you to continue your journey by exploring the useful resources listed in the References sidebar.
Steve C. Orr is an ASPInsider, MCSD, Certified ScrumMaster, Microsoft MVP in ASP.NET, and author of Beginning ASP.NET 2.0 AJAX (Wrox). He's been developing software solutions for leading companies in the Seattle area for more than a decade. When he's not busy designing software systems or writing about them, he can often be found loitering at local user groups and habitually lurking in the ASP.NET newsgroup. Find out more about him at http://SteveOrr.net or e-mail him at mailto:[email protected].
References
RSS Toolkit home page: http://www.codeplex.com/ASPNETRSSToolkit
Dmitry Robsman's blog: http://blogs.msdn.com/dmitryr/
RSS Specification: http://validator.w3.org/feed/docs/rss2.html
RSS & ASP.NET: http://SteveOrr.net/articles/RSS.aspx
My RSS Feed: http://SteveOrr.net/rss.aspx
MSDN RSS Feed: http://www.microsoft.com/feeds/msdn/en-us/rss.xml
Yahoo RSS Feeds: http://news.yahoo.com/rss