Using the ultra-simple XML::RSS::SimpleGen, and following its example step by step, here is a piece of code that generates an RSS feed for CPAN Ratings. It is probably pretty brittle, and the main regexp could probably be improved, but hey, it works!

You can run this at home, or just get the feed at http://xmltwig.com/rss/cpanratings.rss.

Comments, improvements, pointers to an existing feed... are all welcome.

Update: Du'h! Of course http://cpanratings.perl.org/index.rss already has the feed... never mind then ;--( Still, I like XML::RSS::SimpleGen.

#!/usr/bin/perl -w

use strict;

use XML::RSS::SimpleGen;

my $VERSION=0.01;

my $LIVE=1;    # 1: fetch the live page, 0: parse the local cached copy

my $url = "http://cpanratings.perl.org/";

######################################################
# set those variables
my $local_rss   = "/web/infotree/rss/cpanratings.rss";
my $webmaster   = 'me@localhost';
######################################################

my $local_cache = "cpanratings.html";

my $rss = XML::RSS::SimpleGen->new( $url, "CPAN Ratings", "Ratings and Reviews for CPAN");
$rss->language( 'en' );
$rss->webmaster( $webmaster );
$rss->twice_daily();
$rss->item_limit( 25);

if( $LIVE)
  { $rss->get_url( $url ); }    # called in void context, so the content ends up in $_
else
  { open( my $html, '<', $local_cache) or die "cannot open $local_cache: $!";
    local $/;                   # slurp mode: read the whole file at once
    $_=<$html>;
  }

while( m{<h3>\s*
         <a \s+ href="([^"]*)">([^<]*)</a>   # $1: module URL, $2: module name
         \s*\(([^\(]*)\)                     # $3: version
         \s*<img.*?alt="(\d) \s stars?">     # $4: rating
         \s*</h3>
         \s*<p>\s*
         (.*?)                               # $5: text (first line)
         (\s*<br \s* />.*?)?                 # $6: more lines
         \s*</p>
         \s*<p>
         \s*<a[^>]*>([^<]*)</a>              # $7: author
         .*?                                 # date, unused for now
         \s*<br \s* />
         \s*\(<a \s+ href="/([^"]*)"         # $8: url
        }xgs
     )
  { my( $module_url, $module_name, $module_version, $rating, $text, $more, $author, $review_url)
      = ($1, $2, $3, $4, $5, $6, $7, $8);
    my $url= "http://cpanratings.perl.org/$review_url";
    my $title= "$module_name ($module_version): $rating stars - by $author";
    my $descr= $text;
    # if the review has more lines, end the description with an ellipsis
    if( $more) { $descr=~ s{\s*\.*\s*$}{ ...}; }
    $rss->item( $url, $title, $descr);
    unless( $LIVE) { warn "title: '$title' - url: '$url' - descr: '$descr'\n"; }
  }

$rss->save( $local_rss, 5); # the 5 means that the script will scream if the RSS does not change for 5 days in a row

exit;

__END__

=head1 NAME

gen_rss_cpanratings - generate an RSS feed for CPAN Ratings

=head1 SYNOPSIS

set the top variables, then run the script from cron twice a day

=head1 REQUIREMENTS

Perl 5, XML::RSS::SimpleGen

=head1 LICENSE

This library is free software; you can redistribute it and/or
modify it under the same terms as Perl itself.

=head1 AUTHOR

Michel Rodriguez <mirod@xmltwig.com>
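
For anyone wondering what the substitution on the description does, here is a minimal sketch of that one step in isolation. The helper name teaser and the sample strings are made up for illustration; only the substitution itself comes from the script above.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): build the item
# description from the first line of a review.
sub teaser {
    my ($text, $more) = @_;
    my $descr = $text;
    # Only when further lines were captured: replace any trailing dots
    # and whitespace with a single " ..." marker.
    $descr =~ s{\s*\.*\s*$}{ ...} if $more;
    return $descr;
}

print teaser('Great module.', '<br />And more.'), "\n";   # prints "Great module ..."
print teaser('Short and complete.', undef), "\n";         # prints "Short and complete."
```

A one-line review is passed through untouched; a longer one is cut to its first line and flagged as truncated.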

Re: RSS feed for CPAN Ratings
by mirod (Canon) on Apr 20, 2005 at 09:18 UTC

    After looking at the "official feed" I decided that, as the reviews are usually pretty short, I would like to have them included in the feed. Which is done in gen_rss_cpanratings v 0.02.

    The only change is that the description is now built by capturing the full text of the review, markup and all. It is wrapped in a CDATA section and passed to item as a reference. As the doc says: "In the unlikely event where you need to avoid the HTML-removal features, you can do this by passing scalar-references instead of normal strings, like so: rss_item($url, $title, \$not_to_be_escaped)." Easy.
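
    A rough sketch of the wrapping step, not the actual v 0.02 code: cdata_wrap is a hypothetical helper and the review markup is a made-up sample. The extra substitution guards against a literal "]]>" in the review, which would otherwise close the CDATA section early; whether v 0.02 bothers with that is not something this sketch claims.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap raw HTML in a CDATA section.
sub cdata_wrap {
    my ($html) = @_;
    # A literal "]]>" would close the section early, so split it
    # across two adjacent CDATA sections.
    $html =~ s/\]\]>/]]]]><![CDATA[>/g;
    return '<![CDATA[' . $html . ']]>';
}

my $review = '<p>Works well.<br />Docs are <b>great</b>.</p>';   # sample markup
my $descr  = cdata_wrap($review);
print "$descr\n";

# With an XML::RSS::SimpleGen object in $rss, passing a scalar reference
# is what skips the HTML-removal step:
#   $rss->item( $url, $title, \$descr );
```

    The item call is left as a comment so the snippet runs without the module installed.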

    I am not quite sure having markup in the description is entirely kosher, but the feed validates and displays properly in akregator, so I think I'll keep that version (which now updates every other hour).

      You could also send me a patch to make the CPAN Ratings RSS feed an RSS 2.0 feed with the full review included (or some variation of that). Atom feed? :-)

      - ask

      -- ask bjoern hansen, http://www.askbjoernhansen.com/ !try; do();