Ajax RSS reader
Use Ajax to build an RSS reader
Level: Advanced
Jack D Herrington (jherr@pobox.com), Senior Software Engineer, Leverage Software Inc.
12 May 2006
Learn how to build an Asynchronous JavaScript and XML (Ajax) Really Simple Syndication (RSS) reader, as well
as a Web component that you can place on any Web site to look at the articles in the RSS feeds.
The first thing I thought about doing when I read about requesting Extensible Markup Language (XML) from
JavaScript code on a Web page was to get some RSS and display it. But I immediately ran into the XMLHTTP
security restriction, where a page that comes from www.mysite.com can't request pages from anywhere other
than www.mysite.com. My plans to build a generic RSS reader in just the page were dashed. But Web 2.0 is all
about ingenuity, and solving the problem of how to create an RSS reader with XMLHTTP teaches a lot about
how to program the 2.0 Web.
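The restriction described here is the browser's same-origin rule. As a rough sketch (the helper name isSameOrigin is hypothetical and not part of the article's code), two URLs share an origin only when the scheme, host, and port all match:

```javascript
// Hypothetical helper illustrating the same-origin rule that blocks
// cross-site XMLHTTP requests. It relies on the standard URL class.
function isSameOrigin(pageUrl, targetUrl) {
  var page = new URL(pageUrl);
  // Resolve relative targets against the page, as a browser would.
  var target = new URL(targetUrl, pageUrl);
  return page.protocol === target.protocol &&
         page.hostname === target.hostname &&
         page.port === target.port;
}

// A page on www.mysite.com may request its own list.php...
console.log(isSameOrigin('http://www.mysite.com/index.html', '/rss/list.php')); // true
// ...but not a feed hosted somewhere else.
console.log(isSameOrigin('http://www.mysite.com/index.html',
                         'http://rss.cnn.com/rss/cnn_topstories.rss')); // false
```

This is why the reader in this article fetches feeds on the server and only ever asks its own server for data.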
This article walks through the construction of an Ajax-based RSS reader using
both XMLHTTP and <script> tags as the transport mechanisms.
Building the server side
Download the code for this article
The code listings in this article do not display all of the code required to build the RSS reader. For the
complete code, see Download.
The server side of the equation comes in two pieces. The first is the database, and the second is a set of PHP
pages that allow you to add feeds, request the list of feeds, and get the articles associated with a particular
feed. I'll start with the database.
The database
For this article, I use a MySQL database. Listing 1 shows the schema.
Listing 1. The schema for the database
CREATE TABLE rss_feeds (
  rss_feed_id MEDIUMINT NOT NULL AUTO_INCREMENT,
  url TEXT NOT NULL,
  name TEXT NOT NULL,
  last_update TIMESTAMP,
  PRIMARY KEY ( rss_feed_id )
);
CREATE TABLE rss_articles (
  rss_feed_id MEDIUMINT NOT NULL,
  link TEXT NOT NULL,
  title TEXT NOT NULL,
  description TEXT NOT NULL
);
There are two tables. The rss_feeds table contains the list of feeds, and the rss_articles table contains the
list of articles associated with each feed. When the system updates a feed, it deletes all of the current
articles associated with that rss_feed_id and then refills the table with the new set of articles.
The database wrapper
The next step is to wrap the database with a set of PHP classes that build the business logic for the
application. This starts with the DatabaseConnection singleton that manages the connection to the
database, as shown in Listing 2.
Listing 2. The DatabaseConnection singleton in rss_db.php
...
    $this->_handle =& DB::Connect( $dsn, array() );
  }
  public function handle()
  {
    return $this->_handle;
  }
}
This is a standard PHP singleton pattern. It connects to the database and returns a handle through the
handle method. The two require_once statements are another interesting part of this code. The first
references the PHP Extension and Application Repository (PEAR) DB module that connects to the database.
The second references the XML_RSS module that parses RSS feeds. I admit it; I used modules here because
I'm far too lazy to worry about parsing all of the different forms of RSS. If you don't have these modules
installed, use these on the command line:
% pear install DB
And:
% pear install XML_RSS
The DB module is commonly installed, but the XML_RSS module isn't.
The next step is to build a class that wraps the list of feeds so that you can add a feed, get a list of feeds,
and so on. Listing 3 shows this class.
Listing 3. The FeedList class in rss_db.php
class FeedList {
  // Add a feed by URL, unless it is already in the list, and load its articles
  public static function add( $url ) {
    if ( FeedList::getFeedByUrl( $url ) != null ) return;
    $db = DatabaseConnection::get()->handle();
    $rss = new XML_RSS( $url );
    $rss->parse();
    $info = $rss->getChannelInfo();
    $isth = $db->prepare( "INSERT INTO rss_feeds VALUES( null, ?, ?, null )" );
    $db->execute( $isth, array( $url, $info['title'] ) );
    $info = FeedList::getFeedByUrl( $url );
    Feed::update( $info['rss_feed_id'] );
  }
  // Return every feed as an array of associative rows
  public static function getAll( ) {
    $db = DatabaseConnection::get()->handle();
    $res = $db->query( "SELECT * FROM rss_feeds" );
    $rows = array();
    while( $res->fetchInto( $row, DB_FETCHMODE_ASSOC ) ) { $rows []= $row; }
    return $rows;
  }
  // Return the row for a feed ID, or null if there is none
  public static function getFeedInfo( $id ) {
    $db = DatabaseConnection::get()->handle();
    $res = $db->query( "SELECT * FROM rss_feeds WHERE rss_feed_id=?",
      array( $id ) );
    while( $res->fetchInto( $row, DB_FETCHMODE_ASSOC ) ) { return $row; }
    return null;
  }
  // Return the row for a feed URL, or null if there is none
  public static function getFeedByUrl( $url ) {
    $db = DatabaseConnection::get()->handle();
    $res = $db->query( "SELECT * FROM rss_feeds WHERE url=?", array( $url ) );
    while( $res->fetchInto( $row, DB_FETCHMODE_ASSOC ) ) { return $row; }
    return null;
  }
  // Refresh the articles of every feed that has gone stale
  public static function update() {
    $db = DatabaseConnection::get()->handle();
    // Blanking and then restoring the name modifies the row, which lets
    // MySQL's automatic TIMESTAMP handling refresh last_update
    $usth1 = $db->prepare( "UPDATE rss_feeds SET name='' WHERE rss_feed_id=?" );
    $usth2 = $db->prepare( "UPDATE rss_feeds SET name=? WHERE rss_feed_id=?" );
    // Select the feeds that have not been updated in the last ten minutes
    $res = $db->query(
      "SELECT rss_feed_id,name FROM rss_feeds WHERE last_update < NOW() - INTERVAL 10 MINUTE" );
    while( $res->fetchInto( $row, DB_FETCHMODE_ASSOC ) ) {
      Feed::update( $row['rss_feed_id'] );
      $db->execute( $usth1, array( $row['rss_feed_id'] ) );
      $db->execute( $usth2, array( $row['name'], $row['rss_feed_id'] ) );
    }
  }
}
The add method adds a feed to the list and loads its articles. The getAll method returns a list of all of the
feeds. The getFeedInfo method returns the information for a given feed ID. The getFeedByUrl method does the
same thing as the getFeedInfo method, but it uses the URL of the feed as the key. And the update method
calls Feed::update on every feed that hasn't been updated in the last ten minutes.
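The ten-minute rule is easy to state outside of SQL as well. The following is only an illustration in JavaScript; the names TEN_MINUTES_MS and staleFeedIds are hypothetical, and timestamps are assumed to be milliseconds since the epoch:

```javascript
// Illustration only: the article performs this check in the database.
// TEN_MINUTES_MS and staleFeedIds are hypothetical names.
var TEN_MINUTES_MS = 10 * 60 * 1000;

function staleFeedIds(feeds, nowMs) {
  var ids = [];
  for (var i = 0; i < feeds.length; i++) {
    // A feed is stale when its last update is more than ten minutes old.
    if (nowMs - feeds[i].lastUpdateMs > TEN_MINUTES_MS) {
      ids.push(feeds[i].id);
    }
  }
  return ids;
}

var now = Date.now();
console.log(staleFeedIds([
  { id: 1, lastUpdateMs: now - 11 * 60 * 1000 },  // stale
  { id: 2, lastUpdateMs: now - 60 * 1000 }        // fresh
], now));
```

Only the feeds returned by such a check need to be fetched again, which keeps the reader from hitting every feed's server on every request.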
Listing 4 shows the Feed class, which is the final class in the business logic classes. It has methods that deal
with an individual feed.
Listing 4. The Feed class from rss_db.php
class Feed
{
  // Fetch the feed and replace its stored articles with the current ones
  public static function update( $id )
  {
    $db = DatabaseConnection::get()->handle();
    $info = FeedList::getFeedInfo( $id );
    $rss = new XML_RSS( $info['url'] );
    $rss->parse();
    $dsth = $db->prepare( "DELETE FROM rss_articles WHERE rss_feed_id=?" );
    $db->execute( $dsth, array( $id ) );
    $isth = $db->prepare( "INSERT INTO rss_articles VALUES( ?, ?, ?, ? )" );
    foreach ($rss->getItems() as $item) {
      $db->execute( $isth, array( $id,
        $item['link'], $item['title'],
        $item['description'] ) );
    }
  }
  // Return the stored articles for a feed as an array of associative rows
  public static function get( $id )
  {
    $db = DatabaseConnection::get()->handle();
    $res = $db->query( "SELECT * FROM rss_articles WHERE rss_feed_id=?",
      array( $id ) );
    $rows = array();
    while( $res->fetchInto( $row, DB_FETCHMODE_ASSOC ) )
    {
      $rows []= $row;
    }
    return $rows;
  }
}
?>
The update method uses the RSS parser to get the feed and update the database. And the get method
returns the current contents of the articles table for the given feed.
The PHP service pages
The first page you need to use is the add.php page, in Listing 5, to add feeds to the list.
Listing 5. add.php
This is a very simple wrapper around the add method on the FeedList class. The XML tag at the bottom of the
page satisfies the need for the page to return some type of XML indicating the success or failure of the process.
The next page is the list.php page, in Listing 6, that returns the list of feeds in the database.
Listing 6. list.php
...
$dom->formatOutput = true;
$root = $dom->createElement( 'feeds' );
$dom->appendChild( $root );
foreach( $rows as $row )
{
  $an = $dom->createElement( 'feed' );
  $an->setAttribute( 'id', $row['rss_feed_id'] );
  $an->setAttribute( 'link', $row['url'] );
  $an->setAttribute( 'name', $row['name'] );
  $root->appendChild( $an );
}
header( "Content-type: text/xml" );
echo $dom->saveXML();
?>
To make it easier to write the XML properly, I use the Document Object Model (DOM) functions in the PHP
core to create an XML DOM on the fly. Then I use the saveXML function to format it for output.
If I browse to this page using my Firefox® browser, I see the output in Figure 1.
Figure 1. The feed list XML page
Of course, this is after I have added eight feeds to the list.
The final page I need to build before getting into the client side of the system is the read.php page, in
Listing 7, that returns the articles associated with a given feed ID.
Listing 7. read.php
...
$dom->formatOutput = true;
$root = $dom->createElement( 'articles' );
$dom->appendChild( $root );
foreach( $rows as $row )
{
  $an = $dom->createElement( 'article' );
  $an->setAttribute( 'title', $row['title'] );
  $an->setAttribute( 'link', $row['link'] );
  $an->appendChild( $dom->createTextNode( $row['description'] ) );
  $root->appendChild( $an );
}
header( "Content-type: text/xml" );
echo $dom->saveXML();
?>
This is very similar in form to the list.php page. I use the Feed class to get the list of articles. Then I use the
XML DOM object to create the XML and output it. When I browse to this page in Firefox, I see the output in
Figure 2.
Figure 2. The XML from the read.php page
That finishes the server side of the equation. Now I need to put together a Dynamic Hyper Text Markup
Language (DHTML) page that uses Ajax to call these PHP pages.
Building the client
The next thing to do is build a client that uses the PHP pages. I'll build it in three phases so you can follow
along. The first version, in Listing 8, displays a control that shows the list of feeds.
Listing 8. index2.html
<title>Ajax RSS Reader</title>
<style>
body { font-family: arial, verdana, sans-serif; }
</style>
<script>
var g_homeDirectory = 'http://localhost/rss/';
var req = null;
// Pass the parsed XML to the handler once the request has completed successfully
function processReqChange( handler ) {
  if (req.readyState == 4 && req.status == 200 && req.responseXML ) {
    handler( req.responseXML ); }
}
// Request a URL asynchronously and hand the XML response to the handler
function loadXMLDoc( url, handler ) {
  if(window.XMLHttpRequest) {
    try { req = new XMLHttpRequest(); } catch(e) { req = false; }
  }
  else if(window.ActiveXObject)
  {
    try { req = new ActiveXObject("Msxml2.XMLHTTP"); } catch(e) {
    try { req = new ActiveXObject("Microsoft.XMLHTTP"); } catch(e) { req = false; } }
  }
  if(req) {
    req.onreadystatechange = function() { processReqChange( handler ); };
    req.open("GET", url, true);
    req.send("");
  }
}
function parseFeedList( dom ) {
  var elfl = document.getElementById( 'elFeedList' );
  elfl.innerHTML = '';
  var nl = dom.getElementsByTagName( 'feed' );
  for( var i = 0; i < nl.length; i++ ) {
    ...
  }
}
...
getFeedList();
</script>
The page has one control on it, the <select> control. This control is filled by the getFeedList function that
requests the list.php page from the server. When the response arrives, the parseFeedList function adds the
items to the <select> control.
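The step that fills the feed list control (the element with the ID elFeedList) can be sketched as follows. This is only one way to write it: the function names feedNodesToOptions and fillSelect are hypothetical, and the sketch assumes each feed element carries the id and name attributes that list.php writes.

```javascript
// Sketch only: both function names are hypothetical.
// Turn a list of <feed> elements into plain { value, label } pairs.
function feedNodesToOptions(nodeList) {
  var options = [];
  for (var i = 0; i < nodeList.length; i++) {
    var feed = nodeList[i];
    options.push({
      value: feed.getAttribute('id'),
      label: feed.getAttribute('name')
    });
  }
  return options;
}

// Replace the contents of a <select> element with the given options.
// The document is passed in so the function is easy to exercise.
function fillSelect(doc, select, options) {
  // Remove any existing entries before adding the new ones.
  while (select.firstChild) select.removeChild(select.firstChild);
  for (var i = 0; i < options.length; i++) {
    var el = doc.createElement('option');
    el.value = options[i].value;
    // A text node keeps feed names from being interpreted as HTML.
    el.appendChild(doc.createTextNode(options[i].label));
    select.appendChild(el);
  }
}

// In the page this might be called from the response handler:
//   fillSelect( document, document.getElementById( 'elFeedList' ),
//     feedNodesToOptions( dom.getElementsByTagName( 'feed' ) ) );
```

Separating the extraction from the DOM update is a design choice of this sketch; it lets the same fillSelect function serve any data source.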
When I browse to this in Firefox, I see the output in Figure 3.
Figure 3. The first version of the RSS reader
To get these first few feeds into the system, I use the MySQL interface to add them manually.
The next step is to display the content of the selected feed. Listing 9 shows the upgraded code.
Listing 9. index3.html
<title>Ajax RSS Reader</title>
<style>
body { font-family: arial, verdana, sans-serif; }
.title { font-size: 14pt; border-bottom: 1px solid black; }
.title a { text-decoration: none; }
.title a:hover { text-decoration: none; }
.title a:visited { text-decoration: none; }
.title a:active { text-decoration: none; }
.title a:link { text-decoration: none; }
.description { font-size: 9pt; margin-left: 20px; }
</style>
<script>
var g_homeDirectory = 'http://localhost/rss/';
var req = null;
function processReqChange( handler ) { ... }
function loadXMLDoc( url, handler ) { ... }
function parseFeed( dom ) {
  var ela = document.getElementById( 'elArticles' );
  ela.innerHTML = '';
  var elTable = document.createElement( 'table' );
  var elTBody = document.createElement( 'tbody' );
  elTable.appendChild( elTBody );
  var nl = dom.getElementsByTagName( 'article' );
  for( var i = 0; i < nl.length; i++ ) {
    ...
  }
  ...
}
...
getFeedList();
</script>
I omitted the processReqChange and loadXMLDoc functions because they are the same as before. The new
code is in the loadFeed and parseFeed functions that request data from the read.php page, parse it, and
add it to the page.
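The parsing step can be sketched as a small function that turns one article element into a plain object. This assumes the layout that read.php produces (title and link attributes, with the description as the element's text); the function name articleNodeToObject is hypothetical.

```javascript
// Sketch only: articleNodeToObject is a hypothetical name.
function articleNodeToObject(node) {
  var text = '';
  // Concatenate the child text nodes; nodeType 3 is text, 4 is CDATA.
  for (var c = node.firstChild; c; c = c.nextSibling) {
    if (c.nodeType === 3 || c.nodeType === 4) text += c.nodeValue;
  }
  return {
    title: node.getAttribute('title'),
    link: node.getAttribute('link'),
    description: text
  };
}

// In the page, each element returned by
// dom.getElementsByTagName( 'article' ) could be passed through this
// function before the table rows are created.
```

Working with plain objects rather than XML nodes also means the same row-building code can be reused when the data arrives in another format.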
Figure 4 shows the output of this page in Firefox.
Figure 4. The upgraded page that shows the article list
The next step is to finish the page with the ability to add a feed to the list through the add.php page. This
final code for the page is in Listing 10.
Listing 10. index.html
<title>Ajax RSS Reader</title>
...
<script>
var g_homeDirectory = 'http://localhost/rss/';
// The same transfer functions as before
function addFeed()
{
  var url = prompt( "Url" );
  if ( !url ) return; // the dialog was cancelled or left empty
  loadXMLDoc( g_homeDirectory+'add.php?url='+encodeURIComponent( url ), parseAddReturn );
  window.setTimeout( getFeedList, 1000 );
}
function loadFeed( id ) { loadXMLDoc( g_homeDirectory+'read.php?id='+id, parseFeed ); }
function getFeedList() { loadXMLDoc( g_homeDirectory+'list.php', parseFeedList ); }
...
getFeedList();
</script>
Most of the code here is the same, but I have inserted a new Add Feed... button that opens a dialog box
where you can insert a new URL into the feed list. To make it easy on myself, I have the browser wait for
one second after sending the request and then get the new feed list.
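The fixed delay is a shortcut: if add.php takes longer than the delay, the refreshed list will not yet contain the new feed. One alternative, sketched here under the assumption that the transfer function calls its handler when the response arrives, is to refresh from that handler. The name addFeedUrl is hypothetical; load stands in for loadXMLDoc and refresh for getFeedList.

```javascript
// Sketch only: addFeedUrl is a hypothetical name. `load` has the same
// shape as loadXMLDoc( url, handler ) and `refresh` as getFeedList.
function addFeedUrl(homeDirectory, url, load, refresh) {
  if (!url) return false;  // nothing entered, or the prompt was cancelled
  // encodeURIComponent also encodes '+', '&', and '?', so the feed URL
  // travels intact inside the query string.
  var request = homeDirectory + 'add.php?url=' + encodeURIComponent(url);
  // Refresh only after the server has answered the add request.
  load(request, function () { refresh(); });
  return true;
}

// In the page this might be wired up as:
//   addFeedUrl( g_homeDirectory, prompt( "Url" ), loadXMLDoc, getFeedList );
```

Because both requests would share the page's single global req object, starting the refresh only after the add request completes also avoids one request overwriting the other.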
Figure 5 shows the finished page.
Figure 5. The finished page
Now this is pretty cool. But I'm not satisfied because the XMLHTTP security prevents me from taking the
JavaScript code from this page and copying it onto someone else's blog so that anyone can look at the
feeds. To do that, I need to re-engineer the services to use the <script> tag and the JavaScript Object
Notation (JSON) syntax.
Going from XML to JSON
For this article, I'm only going to allow the feeds to be viewed through the script syntax, although I really
could go the whole way using script tags as the data transport mechanism. To get to the feeds, I first need
the feed list encoded as JavaScript code. So I create a list_js.php page as shown in Listing 11.
Listing 11. list_js.php
setFeeds( [ ... ] );
When I run this script on the command line, I see the output in Listing 12.
Listing 12. Output from list_js.php
setFeeds( [
{ id:1, link:'http://muttmansion.com/ds/index.xml', name:'Driving Sideways' },
{ id:2, link:'http://slashdot.org/slashdot.rdf', name:'Slashdot' },
{ id:3, link:'http://muttmansion.com/vl/index.xml', name:'Visible Light' },
{ id:4, link:'http://muttmansion.com/sor/index.xml', name:'Socks on a Rooster' },
{ id:5, link:'http://muttmansion.com/dd/index.xml', name:'Doxie Digest' },
{ id:6, link:'http://rss.cnn.com/rss/cnn_topstories.rss', name:'CNN.com' },
{ id:7, link:'http://rss.cnn.com/rss/cnn_world.rss', name:'CNN.com - World' },
{ id:8, link:'http://rss.cnn.com/rss/cnn_us.rss', name:'CNN.com - U.S.' } ] );
This output is well suited to a <script> tag. When the browser loads it, the setFeeds function is called with
the list of the feeds. That, in turn, sets up the <select> control and loads the first feed.
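The response is nothing more than a function call wrapped around a data literal. To show the shape of that output, here is a sketch in JavaScript (the article's server code is PHP, and toCallbackScript is a hypothetical name). It uses JSON.stringify so that the data is always quoted and escaped correctly:

```javascript
// Sketch only: toCallbackScript is a hypothetical name, and the real
// service in this article is written in PHP.
function toCallbackScript(callbackName, data) {
  // JSON.stringify quotes and escapes strings, so a feed name that
  // contains a quote or a backslash cannot break the generated script.
  return callbackName + '( ' + JSON.stringify(data) + ' );';
}

var script = toCallbackScript('setFeeds', [
  { id: 2, link: 'http://slashdot.org/slashdot.rdf', name: 'Slashdot' }
]);
console.log(script);
```

Whatever produces the output, the important point is the escaping: a feed title such as Jack's Blog would end a single-quoted string early if it were written out unescaped.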
I also need the equivalent of the read.php page that returns article data in JavaScript code instead of
XML. Listing 13 shows the read_js.php page.
Listing 13. read_js.php
addFeed( ...,
[ ... ] );
Once again, after I run this script on the command line, I see the output in Listing 14.
Listing 14. Output from read_js.php
addFeed( 1,
[ { title:'War',
link:'http://www.muttmansion.com/ds/archives/002816.html',
description:'The...' }, ... ] );
I've truncated it here for brevity, but you get the point. The addFeed function is called with the ID of the
feed and the article data encoded in JavaScript format.
With these new JavaScript-enabled services, I can now create a new page that uses the services. Listing 15
shows this new page.
Listing 15. script.html
<title>Script Component Test</title>
...
<script>
var g_homeDirectory = 'http://localhost/rss/';
// Load JavaScript from a URL by appending a <script> element to the page
function loadScript( url ) {
  var elScript = document.createElement( 'script' );
  elScript.src = url;
  document.body.appendChild( elScript );
}
// Called by the code that read_js.php returns
function addFeed( id, articles ) {
  var ela = document.getElementById( 'elArticles' );
  ela.innerHTML = '';
  var elTable = document.createElement( 'table' );
  var elTBody = document.createElement( 'tbody' );
  elTable.appendChild( elTBody );
  for( var a = 0; a < articles.length; a++ ) {
    var title = articles[a].title;
    var link = articles[a].link;
    var description = articles[a].description;
    // Create elements as before...
  }
  ela.appendChild( elTable );
}
// Called by the code that list_js.php returns
function setFeeds( feeds ) {
  var elfl = document.getElementById( 'elFeedList' );
  elfl.innerHTML = '';
  var firstId = null;
  for( var f = 0; f < feeds.length; f++ ) {
    var elOption = document.createElement( 'option' );
    elOption.value = feeds[f].id;
    elOption.innerHTML = feeds[f].name;
    elfl.appendChild( elOption );
    if ( firstId == null ) firstId = feeds[f].id;
  }
  // Load the first feed, if the list is not empty
  if ( firstId != null ) loadFeed( firstId );
}
function loadFeed( id ) { loadScript( g_homeDirectory+'read_js.php?id='+id ); }
function getFeedList() { loadScript( g_homeDirectory+'list_js.php' ); }
...
getFeedList();
</script>
This page is similar to the original index.html page. However, instead of using the loadXMLDoc function, I
use a new function, called loadScript, that creates a <script> tag dynamically. The <script> tag then loads
the JavaScript code from the specified URL.
These script tags call the read_js.php and list_js.php pages. These pages, in turn, create JavaScript code
that calls back to the setFeeds and addFeed functions in the host page.
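Each call to loadScript leaves another script element in the page. A possible refinement, sketched here and not part of the article's code, removes the element once it has loaded. The name loadAndRemoveScript is hypothetical, the document is passed in as a parameter, and the sketch assumes a browser that fires the load event on script elements.

```javascript
// Sketch only: loadAndRemoveScript is a hypothetical name. It assumes
// the browser fires onload for <script> elements.
function loadAndRemoveScript(doc, url) {
  var elScript = doc.createElement('script');
  elScript.onload = function () {
    // The returned code has already run by now, so the element
    // itself is no longer needed.
    if (elScript.parentNode) elScript.parentNode.removeChild(elScript);
  };
  elScript.src = url;
  doc.body.appendChild(elScript);
  return elScript;
}

// In the page this might replace loadScript:
//   loadAndRemoveScript( document, g_homeDirectory + 'list_js.php' );
```

Removing the element does not undo the script's effects; the callback has already updated the page by the time the load event fires.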
When I go to the page, my browser displays what is in Figure 6.
Figure 6. The RSS reader that uses <script> tags for the data
The big advantage of this code is that anyone can use the View Source command to view the script from
the page and copy the code into their own pages. Then their pages will use the PHP services that return
JavaScript code to update the page.
Conclusion
In this article, I demonstrated how to use two different techniques to access data dynamically from a Web
page to create an RSS reader on the page. Hopefully, you can use the concepts and code provided here to
enrich your own application without having to entirely retool your code. That's the real value of Ajax -- if
you are familiar with Web technologies, it's a snap to upgrade the interactivity of your page with a few new
services on the server side and a little code on the client side.