RSS


by Jim Rapoza

What is RSS, and why should you be interested in it?

You should care about RSS if a) you’re interested in regularly updated newsfeeds from around the Web a la the old PointCast; b) you’d like to add syndicated content from news sites and blogs to your Web site; or c) you want to make it possible for sites to syndicate your content—thus driving more traffic your way—and for visitors to see updated headlines in a newsreader.

Introduction

RSS is one of the hottest technologies at the moment, and even big web publishers (such as the New York Times) are adopting it. However, there are still a lot of websites that do not have RSS feeds.

If you still want to be able to check those websites in your favourite aggregator, you need to create your own RSS feed for them. This can be done automatically with PHP, using a method called screen scraping. Screen scraping is usually frowned upon, as it’s mostly used to steal content from other websites.

I personally believe that in this case, where the goal is to automatically generate an RSS feed, screen scraping is not a bad thing. Now, on to the code!

Getting the content

For this article, we’ll use PHPit as an example, despite the fact that PHPit already has RSS feeds (http://www.phpit.net/syndication/).

We’ll want to generate an RSS feed from the content listed on the front page (http://www.phpit.net). The first step in screen scraping is getting the complete page. In PHP this can be done very easily by using implode("", file("[the url here]")); if your web host allows it. If you can’t use file(), you’ll have to use a different method of getting the page, e.g. the cURL library (http://www.php.net/curl).
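
Where file() cannot open URLs, the cURL route mentioned above might look like the following minimal sketch. The function name fetch_page() is made up for this example, and the redirect and timeout settings are assumptions, not something PHPit prescribes.

```php
<?php
// fetch_page() is a hypothetical helper, not part of the original script.
function fetch_page($url)
{
    // Preferred route: PHP's URL wrappers, if the host has enabled them
    if (ini_get('allow_url_fopen')) {
        $data = @file_get_contents($url);
        if ($data !== false) {
            return $data;
        }
    }

    // Fallback: the cURL extension
    if (function_exists('curl_init')) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
        curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // give up after 10 seconds (assumed value)
        $data = curl_exec($ch);
        if ($data !== false) {
            return $data;
        }
    }

    return false; // the page could not be fetched
}
```

With this in place, the page could be loaded with $data = fetch_page("http://www.phpit.net/"); and checked against false before parsing.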

Now that we have the page available, we can parse it using some regular expressions. The key to screen scraping is looking for patterns that match the content, e.g. are all the content items wrapped in <div> elements or something else? If you can discover such a pattern, you can use preg_match_all() to get all the content items.

For PHPit, the pattern that matches the content is <div class="contentitem">[Content Here]</div>. You can verify this yourself by going to the main page of PHPit and viewing the source.
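
To see what preg_match_all() returns for such a pattern, it can help to try it on a small fragment first. The markup below is a hypothetical imitation of the structure described above (the titles, URLs and author names are invented), so treat it as a sketch and not as PHPit's actual source.

```php
<?php
// Hypothetical fragment imitating the described markup; the real page may differ.
$sample = '<div class="contentitem"><h3><a href="/article/one/">First article</a></h3>'
        . '<p>Intro text one. <span class="byline">By Jane Doe</span></p></div>' . "\n"
        . '<div class="contentitem"><h3><a href="/article/two/">Second article</a></h3>'
        . '<p>Intro text two. <span class="byline">By John Roe</span></p></div>';

// [^`]*? lazily matches any run of characters (newlines included) that contains no backtick
$count = preg_match_all('/<div class="contentitem">([^`]*?)<\/div>/', $sample, $matches);

// $matches[0] holds the full matches including the <div>, $matches[1] only the inner markup
echo $count . " items found\n";
echo $matches[1][0] . "\n";
```

Because the match is lazy, it stops at the first closing </div>, so this only works as long as a content item contains no nested <div>.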

Once the pattern matches, we have all the content items. The next step is to retrieve the individual pieces of information, i.e. url, title, author and text. This can be done by using some more regular expressions and str_replace() on each content item.

By now we have the following code:

<?php

// Get page
$url = "http://www.phpit.net/";
$data = implode("", file($url)); 

// Get content items
preg_match_all ("/<div class=\"contentitem\">([^`]*?)<\/div>/", $data, $matches);

As I said, the next step is to retrieve the individual information, but first let’s start our feed by setting the appropriate header (text/xml) and printing the channel information.

// Begin feed
header ("Content-Type: text/xml; charset=ISO-8859-1");
echo "<?xml version=\"1.0\" encoding=\"ISO-8859-1\" ?>\n";
?>
<rss version="2.0"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:content="http://purl.org/rss/1.0/modules/content/"
  xmlns:admin="http://webns.net/mvcb/"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
	<channel>
		<title>PHPit Latest Content</title>
		<description>The latest content from PHPit (http://www.phpit.net), screen scraped!</description>
		<link>http://www.phpit.net</link>
		<language>en-us</language>

Now it’s time to loop through the items and print their RSS XML. For each item we extract the information we need using more regular expressions and preg_match(), and then print the RSS for that item.

<?php
// Loop through each content item
foreach ($matches[0] as $match) {
	// First, get title
	preg_match ("/\">([^`]*?)<\/a><\/h3>/", $match, $temp);
	$title = $temp['1'];
	$title = strip_tags($title);
	$title = trim($title);

	// Second, get url
	preg_match ("/<a href=\"([^`]*?)\">/", $match, $temp);
	$url = $temp['1'];
	$url = trim($url);

	// Third, get text
	preg_match ("/<p>([^`]*?)<span class=\"byline\">/", $match, $temp);
	$text = $temp['1'];
	$text = trim($text);

	// Fourth, and finally, get author
	preg_match ("/<span class=\"byline\">By ([^`]*?)<\/span>/", $match, $temp);
	$author = $temp['1'];
	$author = trim($author);

	// Echo RSS XML
	echo "<item>\n";
		echo "\t\t\t<title>" . strip_tags($title) . "</title>\n";
		echo "\t\t\t<link>http://www.phpit.net" . strip_tags($url) . "</link>\n";
		echo "\t\t\t<description>" . strip_tags($text) . "</description>\n";
		echo "\t\t\t<content:encoded><![CDATA[ \n";
		echo $text . "\n";
		echo " ]]></content:encoded>\n";
		echo "\t\t\t<dc:creator>" . strip_tags($author) . "</dc:creator>\n";
	echo "\t\t</item>\n";
}
?>

And finally, the RSS file is closed off.

</channel>
</rss>

That’s all. If you put all the code together, as in the demo script, you’ll have a working RSS feed.
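
One detail the loop leaves open is escaping. strip_tags() removes markup but leaves HTML entities such as &nbsp; in place, and those are not defined in plain XML. A minimal sketch of a helper that could be used for the title, description and author follows; xml_text() is a hypothetical name, and the default charset matches the ISO-8859-1 header used above.

```php
<?php
// xml_text() is a hypothetical helper, not part of the original script.
function xml_text($html, $charset = 'ISO-8859-1')
{
    $text = strip_tags($html);
    // Turn HTML entities such as &eacute; into real characters first ...
    $text = html_entity_decode($text, ENT_QUOTES, $charset);
    // ... then escape only what XML itself reserves (&, <, > and quotes)
    return htmlspecialchars(trim($text), ENT_QUOTES | ENT_XML1, $charset);
}

echo xml_text('<a href="/x">Tips &amp; tricks for <b>PHP</b></a>') . "\n";
// prints: Tips &amp; tricks for PHP
```

In the loop, strip_tags($title) would then become xml_text($title), and likewise for the other plain-text elements.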

Conclusion

In this tutorial I have shown you how to create an RSS feed for a website that does not yet have one of its own. Though the regular expressions are different for each website, the principle is exactly the same.

One thing I should mention is that you shouldn’t immediately screen scrape a website’s content. E-mail them first about an RSS feed. Who knows, they might set one up themselves, and that would be even better.

Download sample script at http://www.phpit.net/viewsource.php?url=/demo/screenscrape%20rss/example.php

About The Author

Dennis Pallett is a young tech writer, with much experience in ASP, PHP and other web technologies. He enjoys writing, and has written several articles and tutorials. To find more of his work, look at his websites at http://www.phpit.net, http://www.aspit.net and http://www.ezfaqs.com

This article is intended as a guide for webmasters who want to display automatically updated content on their website in the form of RSS feeds. I will cover the easiest method to implement: using JavaScript to display RSS headlines and so add dynamic content to your pages. This will allow you to display headlines from syndicated content around the web on your website.

RSS to JavaScript.

By far the easiest method is to use client-side JavaScript to parse and display the headlines on your site. All you need to do is cut and paste some HTML or JavaScript code into the web page where you want the RSS feed headlines to appear.

Several sites offer a free service that lets you choose your feed source and a few display formatting options. You are then presented with some JavaScript code that you can cut and paste into your website.

Before I give you the addresses of the sites that offer this service, there are a few points I need to clarify. Although you will achieve your goal of displaying dynamic content on your site in a few short minutes, there are some downsides to this method.

JavaScript is not search engine friendly.

As you may or may not already know, content written into the page by JavaScript is generally not visible to search engine spiders that do not run scripts. They will not see the RSS feed you have parsed into your site, so this will not benefit you if you are doing this to improve your search engine rankings.

You are using a third party service.

The second potential downside is that although the JavaScript is on your site, you are actually calling a script on another server. This could lead to a couple of problems. If that server is busy, the news feed will take longer to display on your site. And if the third party server fails or disappears altogether, your feed will not be displayed at all.
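
For readers who do code and whose host runs PHP, the dependency on someone else's server can be avoided by serving the JavaScript yourself. The sketch below only illustrates the idea and is not how any of the services listed below actually work; headlines_to_js() and the shape of the $items array are assumptions made for this example.

```php
<?php
// headlines_to_js() is a hypothetical function; $items would normally come from a parsed feed.
function headlines_to_js(array $items)
{
    $html = '<ul>';
    foreach ($items as $item) {
        $html .= '<li><a href="' . htmlspecialchars($item['link'], ENT_QUOTES) . '">'
               . htmlspecialchars($item['title'], ENT_QUOTES) . '</a></li>';
    }
    $html .= '</ul>';

    // json_encode() turns the HTML into a correctly quoted JavaScript string literal
    return 'document.write(' . json_encode($html) . ');';
}

echo headlines_to_js(array(
    array('title' => 'Widgets & gadgets', 'link' => 'http://www.example.com/widgets.html'),
));
```

A page would then include this PHP file through an ordinary script tag, exactly as it would include the code from a third party service. The search engine downside described above still applies.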

In summary, there are a few downsides, but if you do not code and want some feeds on your site quickly then this is the way to go. Now that you understand what is involved, here are the sites that provide the free RSS to JavaScript service. All you need to do is follow the on-site instructions.

Feed2JS
RSS2HTML
RSS-to-Javascript
FeedSweep
RSS Xpress Lite

About The Author

Allan is the webmaster at http://www.newsniche.com, an RSS resource for webmasters. Learn how to use RSS to attract and retain visitors to your site.

An RSS news feed can be used to communicate with your target audience. It is an ideal means of notifying people of new content on your website without the need for them to keep on visiting your site. You can send newsletters to your readership without having to use email and risk being accused of spamming. You will be comfortable in the knowledge that people who request your feed are actually interested in it because they have actively subscribed to it. This article will explain just how to create your own RSS news feed.

There are a couple of ways to create an RSS file: you can use an editor designed for the purpose, or you can create the file using a simple text editor. The latter will require you to learn some XML, whilst the former will do the hard part for you. First I will describe an RSS file. There are several versions, and I will be showing you version 2.0, the latest RSS version.

An RSS file looks much like an HTML file, except that it has different tags and the file name ends in .rss or .xml rather than .html. The file is made up of header information and item information; the item information contains the actual news items.

The first section of the file contains the header information. This states that the file is XML and which version, the encoding used and the version of RSS that you are using. This part of the file is mandatory. Next up is the channel tag, which encloses the whole of the rest of the file. It is followed by a title, description and link, which explain what the feed is about and which website it is associated with. The final part of the header is the optional image information. If you use this, the software that parses or reads your file can display a small picture such as a logo.

<?xml version="1.0" encoding="iso-8859-1"?>
<rss version="2.0">
<channel>
<title>The Widget news feed</title>
<description>The latest news on widgets</description>
<link>http://www.widget.com/</link>
<image>
<title>Widget News</title>
<url>http://www.widget.com/widget.gif</url>
<link>http://www.widget.com/</link>
</image>

The body of the file is made up of the news items. Each news item is enclosed in the item tag and consists of a title, a description, a publication date and a link. The date needs to be in the format shown in the example below.

<item>
<title>Which is the best widget?</title>
<description>In this article we discuss the release of several new widgets, but which is the best widget?</description>
<pubDate>Sun, 20 Mar 2005 14:38:50 GMT</pubDate>
<link>http://www.widget.com/the-best-widgets.html</link>
</item>
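
The pubDate format used above is the RFC 822 style date that RSS 2.0 expects. If the items come from a script and not from a text editor, a date in this format can be produced as in the sketch below; rss_date() is a hypothetical helper name.

```php
<?php
// rss_date() is a hypothetical helper name.
function rss_date($timestamp)
{
    // gmdate() formats in GMT; the backslashes stop "GMT" being read as format characters
    return gmdate('D, d M Y H:i:s \G\M\T', $timestamp);
}

// gmmktime(hour, minute, second, month, day, year)
echo rss_date(gmmktime(14, 38, 50, 3, 20, 2005)) . "\n";
// prints: Sun, 20 Mar 2005 14:38:50 GMT
```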

You can have as many items in the feed as you like, but many webmasters just show the 10 most recent items to keep bandwidth usage down and so as not to overwhelm the end user with too many items.

Finally the file is ended with the closing channel tag and a closing RSS tag.

</channel>
</rss>

I have covered the basic tags needed to create an RSS file. There are other tags that can be used, and these are explained in the RSS 2.0 specification.
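
If you would rather generate the file than type it, the pieces described above can be assembled by a short script. This is a minimal sketch under stated assumptions: build_rss() and the array keys are names invented for the example, and it writes UTF-8 where the hand-written sample uses iso-8859-1.

```php
<?php
// build_rss() and the array keys are hypothetical names chosen for this sketch.
function build_rss(array $channel, array $items)
{
    // Escape the characters XML reserves so a stray & cannot break the feed
    $esc = function ($value) {
        return htmlspecialchars($value, ENT_QUOTES | ENT_XML1, 'UTF-8');
    };

    // Header information: XML declaration, rss version, channel details
    $xml  = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
    $xml .= '<rss version="2.0">' . "\n<channel>\n";
    $xml .= '<title>' . $esc($channel['title']) . "</title>\n";
    $xml .= '<description>' . $esc($channel['description']) . "</description>\n";
    $xml .= '<link>' . $esc($channel['link']) . "</link>\n";

    // Item information: one <item> per news item
    foreach ($items as $item) {
        $xml .= "<item>\n";
        $xml .= '<title>' . $esc($item['title']) . "</title>\n";
        $xml .= '<description>' . $esc($item['description']) . "</description>\n";
        $xml .= '<pubDate>' . gmdate('D, d M Y H:i:s \G\M\T', $item['time']) . "</pubDate>\n";
        $xml .= '<link>' . $esc($item['link']) . "</link>\n";
        $xml .= "</item>\n";
    }

    // Closing channel and rss tags
    return $xml . "</channel>\n</rss>\n";
}

$feed = build_rss(
    array(
        'title'       => 'The Widget news feed',
        'description' => 'The latest news on widgets',
        'link'        => 'http://www.widget.com/',
    ),
    array(
        array(
            'title'       => 'Which is the best widget?',
            'description' => 'We discuss the release of several new widgets.',
            'time'        => gmmktime(14, 38, 50, 3, 20, 2005),
            'link'        => 'http://www.widget.com/the-best-widgets.html',
        ),
    )
);
echo $feed;
```

The result can be saved with file_put_contents() under a name ending in .xml or .rss and uploaded like a hand-written file.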

Once you have created your file you will need to verify that it is valid. To do this, upload the file to your server and then run it through a feed validator. Your file is now ready, and anyone can subscribe to your feed just by pointing their RSS reader at your RSS file.
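
Before uploading, a quick local check can catch the most common mistakes, such as an unclosed tag or a missing channel element. The sketch below is no substitute for a full validator; check_feed() is a hypothetical helper that only tests well-formedness and the three channel elements described earlier.

```php
<?php
// check_feed() is a hypothetical helper; it does not replace a real feed validator.
function check_feed($xmlString)
{
    libxml_use_internal_errors(true); // collect parse errors instead of printing warnings
    $rss = simplexml_load_string($xmlString);
    if ($rss === false) {
        libxml_clear_errors();
        return array('not well-formed XML');
    }

    $problems = array();
    if ($rss->getName() !== 'rss') {
        $problems[] = 'root element is not <rss>';
    }
    foreach (array('title', 'description', 'link') as $tag) {
        if (!isset($rss->channel->$tag)) {
            $problems[] = "channel is missing <$tag>";
        }
    }
    return $problems; // an empty array means no problem was found
}
```

Calling check_feed(file_get_contents('rss.xml')) on a local copy returns a list of problems; rss.xml is just a placeholder file name here.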

That is the basics covered. I will be covering other areas in future articles as there is far too much information to fit into a single article.

About The Author

Allan is the webmaster at http://www.newsniche.com, an RSS resource for webmasters. Learn how to use RSS to attract and retain visitors to your site.

You have created an RSS feed, or maybe you have several feeds and you post at least once a week to keep your RSS subscribers interested. So now you can sit back and watch your hit counter tally up all those extra visitors. Well you may think you have finished but there are a few things you can do to improve things still further.

1. Let browsers and search engines know you have an RSS feed available.

To do this you need to add a line of HTML to every page that should point to your RSS feed. The line will be:

 <link rel="alternate" type="application/rss+xml" title="title" href="http://www.site.com/rss.xml">

Place this line between the HEAD tags of your page. Once you have done this, some search engine bots and web browsers such as Firefox will know you have an RSS feed available.

2. Submit your RSS feed to the RSS and Blog directories.

Just as you would submit your site to search engines, you can submit your feed to RSS-specific directories. This will give your RSS feed greater exposure to an audience that is already interested in and educated about the benefits of RSS. This is a list of Alexa ranked RSS directories and another can be found here.

3. Announce that your RSS feed has been updated.

Every time you add a new item to your RSS feed you can announce it to the world, or at least to many of the RSS directories. To do this you need to ping each service. If you are using blogging software, it is probably already doing this for you; check your documentation.

For those of you who do not have software set up to automatically ping for you there is no need to worry. There is a free service at Ping-o-matic that will do this for you.
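
For the curious, a ping is usually a small XML-RPC request that calls weblogUpdates.ping with the site name and the site URL. The sketch below shows roughly what a script would send. The function names are hypothetical, and the endpoint used in the usage note is an assumption that should be checked against the documentation of the ping service you use.

```php
<?php
// build_ping_request() and send_ping() are hypothetical helper names.
function build_ping_request($siteName, $siteUrl)
{
    $name = htmlspecialchars($siteName, ENT_QUOTES | ENT_XML1, 'UTF-8');
    $url  = htmlspecialchars($siteUrl, ENT_QUOTES | ENT_XML1, 'UTF-8');

    // XML-RPC call: method name followed by two string parameters
    return '<?xml version="1.0"?>' . "\n"
         . '<methodCall><methodName>weblogUpdates.ping</methodName><params>'
         . '<param><value><string>' . $name . '</string></value></param>'
         . '<param><value><string>' . $url . '</string></value></param>'
         . '</params></methodCall>';
}

function send_ping($endpoint, $body)
{
    // POST the request body; requires allow_url_fopen to be enabled
    $context = stream_context_create(array('http' => array(
        'method'  => 'POST',
        'header'  => "Content-Type: text/xml\r\n",
        'content' => $body,
        'timeout' => 10,
    )));
    return @file_get_contents($endpoint, false, $context); // response XML, or false on failure
}
```

A call such as send_ping('http://rpc.pingomatic.com/', build_ping_request('My Site', 'http://www.site.com/')) would then announce the update, assuming that endpoint is correct for your service.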

About The Author

Allan is the webmaster at http://www.newsniche.com, an RSS resource for webmasters. Learn how to use RSS to attract and retain visitors to your site.

There is lively debate about the republishing of RSS feeds on other sites. The argument is about feeds being reused in ways that are unfair to the feed publisher. This includes republishing entire articles and not giving sufficient credit to the original source.

Before we go into the details you may want to brush up on your understanding of RSS. This will help you fully appreciate and fully understand the issues involved.

I am glad this conversation is happening now as it needs to be made clear what fair use of RSS feeds actually means. There may be webmasters who are republishing RSS feeds in all innocence at the moment not realising the furore that is going on around them with regards to their republishing activities. I would like to help clear up any misunderstandings that surround RSS republishing.

Being an RSS publisher myself who is considering republishing other authors’ RSS feeds, I would like to make sure I am not treading on any toes. I am basing the following RSS republishing etiquette on the good practice that Rok Hrastnik has advocated.

If you wish to republish an RSS feed then you should first consult the publisher about your intentions. This would be an email to the author stating how you wish to reuse their feed, the page or pages the feed will be republished on, and the attributions you will make. You will need to clarify some points. If the author’s feed contains ads, will they be republished? Will you be monetizing the author’s work by placing ads on your republished page? To avoid conflict these issues need to be sorted out.

The general guidelines Rok Hrastnik has provided state that the article title must link back to the original article. If the RSS feed contains a complete article, only an excerpt can be republished; Rok suggests 100 to 200 words. A link should also be provided to the article source, the website of the original publisher.
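
If you republish with a script, the excerpt limit is easy to enforce in code. The sketch below cuts a description down to a given number of words; excerpt_words() is a hypothetical helper, and the default of 150 words is simply a value inside the suggested range.

```php
<?php
// excerpt_words() is a hypothetical helper; 150 is an assumed default within the 100 to 200 word range.
function excerpt_words($html, $maxWords = 150)
{
    $text  = trim(strip_tags($html));
    // Split on any run of whitespace and drop empty pieces
    $words = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);
    if (count($words) <= $maxWords) {
        return implode(' ', $words);
    }
    return implode(' ', array_slice($words, 0, $maxWords)) . ' ...';
}

echo excerpt_words('<p>One two three four five six</p>', 4) . "\n";
// prints: One two three four ...
```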

Further to this it is suggested that no archives are kept on the republished site and no full articles are used. I would suggest permission is sought from the original author if you wish to keep an archive on your site.

You can follow this discussion further at PR meets the WWW and Micro persuasion.

About The Author

Allan is the webmaster at NewsNiche an RSS resource for webmasters. Learn how to use RSS to attract and retain visitors to your site.

newsniche.com  

You can make it easier for your visitors to subscribe to your RSS feed. With a free and easy to install JavaScript function you can add the QuickSub feed button to your webpage in just a few minutes. Let me show you just how easy it is.

QuickSub is a JavaScript mouseover function that produces a list of RSS feed readers, any of which your visitors can use to subscribe to your RSS news feed with one click. You can see it in action on my RSS resource site: just move the mouse over the subscribe link and you should see a list of RSS feed readers. If you click on one of the news reader links, it will open that RSS reader and add the feed to it. The visitor needs that particular news reader installed on their computer for this to work. So, for example, if your visitor uses SharpReader as their RSS reader, they would click on the SharpReader link and this would add your feed to their reader.

To use QuickSub on your site you will first need to download the JavaScript and CSS files from the QuickSub site. The download is compressed, so you will need to unzip it, which will leave you with quicksub.css and quicksub.js as well as a sample HTML file.

Upload the CSS and JavaScript files to your server. Now you will need to add some code to your web pages. You will need to do this for all of the pages on which you wish to use QuickSub.

First you need to copy some code to call the CSS file. Add this line within your head tags.

<style type="text/css"> @import "quicksub.css"; </style>

Then copy this code into the body of your page.

<div id="quickSub" style="position:absolute; visibility:hidden; z-index:1000;" onMouseOut="return timeqs();" onMouseMove="return delayqs();"></div>

<script language="JavaScript" src="quicksub.js"><!-- quickSub (c) Jason Brome --></script>

Then where you want to use QuickSub place this code in the body of your page.

 <a href="http://www.sitename.com/rssfeed.xml" onmouseout="return timeqs();" onmouseover="return quicksub(this, 'http://www.sitename.com/rssfeed.xml');">Your link text here</a>

You just need to replace the path with the path to your RSS feed and enter your own link text. All that is left now is to upload your modified page to your web server, and the new QuickSub JavaScript will be active.

About The Author

Allan is the webmaster at NewsNiche an RSS resource for webmasters. Learn how to use RSS to attract and retain visitors to your site.

In this article I am going to cover some tools that you can use that will allow you to publish RSS feeds on your site. This will allow you to have fresh, updated content on your site and you have control of what sort of content you display and how often it is updated.

First off, if you do not know much about RSS or feel you require more information, take a look at this RSS publishers FAQ and then rejoin us later.

There are several ways you can go about publishing RSS content, and this article covers two of them. The first is to use third party software that will take care of the RSS republishing for you. The second is to use some freely available PHP code to generate your RSS pages.

If you do not know what PHP is, or have little knowledge of PHP or programming, then I would recommend RSS Equalizer, which takes care of the complicated stuff for you. RSS Equalizer produces HTML pages that it has transformed from the RSS feeds it uses as its source.

RSS Equalizer is a PHP script that runs on your server, so you will need to make sure your host can run PHP; most web hosts do. Once installed and set up, RSS Equalizer can be left to parse content from the RSS feeds and produce readable HTML pages on your website.

If you have any programming experience or know a little PHP then there are some other free tools that you can use. These PHP scripts will allow you to parse RSS feeds and, if you know PHP, will give you more options for customisation. These tools are CaRP, Last RSS and zFeeder.
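
To give an idea of what such scripts do under the hood, here is a minimal sketch that turns an RSS 2.0 feed into an HTML list of headlines using PHP's SimpleXML extension. It does not reproduce the API of CaRP, Last RSS or zFeeder; rss_to_html() is a hypothetical function written for this example.

```php
<?php
// rss_to_html() is a hypothetical function, not the API of any of the tools named above.
function rss_to_html($xmlString, $maxItems = 10)
{
    $rss = @simplexml_load_string($xmlString);
    if ($rss === false || !isset($rss->channel)) {
        return ''; // not a usable RSS 2.0 feed
    }

    $html  = "<ul>\n";
    $count = 0;
    foreach ($rss->channel->item as $item) {
        if ($count++ >= $maxItems) {
            break; // show only the most recent items
        }
        // Escape for HTML so feed content cannot inject markup into the page
        $html .= '<li><a href="' . htmlspecialchars((string) $item->link, ENT_QUOTES) . '">'
               . htmlspecialchars((string) $item->title, ENT_QUOTES) . "</a></li>\n";
    }
    return $html . "</ul>\n";
}
```

On a page this would be called as echo rss_to_html(file_get_contents($feedUrl)); where $feedUrl is the address of the feed you have permission to republish. Because the HTML is produced on the server, the headlines are part of the page itself.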

If you have the time and feel you can handle the PHP, then the free PHP scripts above will be your best option. If you have neither the time nor the inclination and want the hard work already done for you, then try out RSS Equalizer. It’s not free, but it’s the best option for the non-programmer.

About The Author

Allan is the webmaster at NewsNiche an RSS resource for webmasters. Learn how to use RSS to attract and retain visitors to your site.

newsniche.com