in reply to Re^2: Parsing RSS Feeds
in thread Parsing RSS Feeds

There are many RSS modules already on CPAN, but if you want to do this as an exercise in module writing, see perlmodtut and the Simple Module Tutorial.

Re^4: Parsing RSS Feeds
by shekarkcb (Beadle) on Mar 25, 2010 at 11:31 UTC
    Basically, I need to parse different types of feeds, so I thought of writing a separate module. The feeds could be a pure RSS feed, a pure Media RSS feed, a YouTube feed, a combined Media/RSS feed, a Metacafe feed, a Yahoo feed, a BLIP TV feed, etc.
    This is why I thought of writing a general module. Are there any existing modules that can be used for this? If so, please suggest some. I thought the approach would be like this:
    • Check the type of feed (RSS, Media RSS, Apple, Yahoo, YouTube, etc.)
    • Call a different function to handle each type
My question: how can I differentiate between feeds?
if($val !~ /(.*)<rss version=\"(1|2\.0)\"(.*)/is ) { die qq{ Not a RSS Feed....Dying here..\n }; }
I am just trying to figure out whether it is an RSS feed or not.
Some RSS feeds contain <?xml ...?> and then, on the next line, <rss ...
But this is not working. I just need to match an optional <?xml before <rss; it may be on the same line or the next.
Please help.

Thanks,
Shekar

      For RSS parsing, I would try an XML parser.
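As a rough sketch of that idea: instead of matching `<rss` with a regex, let the parser read the document and look at the root element. This assumes XML::LibXML from CPAN (the reply does not name a specific parser), and `feed_type` is a hypothetical helper name. The optional `<?xml ...?>` declaration and any line breaks before `<rss` are handled by the parser, so they no longer matter.

```perl
use strict;
use warnings;
use XML::LibXML;    # assumption: any XML parser would do; this one is from CPAN

# Hypothetical helper: classify a feed by its root element.
# Returns 'rss', 'atom', 'rdf', or undef for anything else.
sub feed_type {
    my ($xml) = @_;

    # load_xml dies on malformed input, so trap that and report "unknown".
    my $doc = eval { XML::LibXML->load_xml(string => $xml) };
    return unless $doc;

    my $root = $doc->documentElement;
    my $name = $root->localname;
    my $ns   = $root->namespaceURI // '';

    return 'rss'  if $name eq 'rss';
    return 'atom' if $name eq 'feed' && $ns eq 'http://www.w3.org/2005/Atom';
    return 'rdf'  if $name eq 'RDF';    # RSS 1.0 uses an rdf:RDF root
    return;
}

# The XML declaration sits on its own line, as in the question.
my $feed = qq{<?xml version="1.0"?>\n<rss version="2.0"><channel/></rss>};
my $type = feed_type($feed) // 'unknown';
print "$type\n";
```

A dispatch table keyed on the returned type could then call a different handler per feed flavour.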

      For the rest, I'm no expert on how the different feeds differ. Plagger is a Perl module for consuming various RSS feeds; maybe you can look at that.

      Personally, I would, for each different format, create XPath queries that extract the interesting payload from the RSS documents. Almost all XML parsers will supply you with an XPath engine.
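A minimal sketch of the per-format XPath idea, again assuming XML::LibXML and its XPathContext class. The `%queries` table, the `extract_items` name, and the particular fields pulled out are illustrative assumptions, not a survey of what each site actually publishes; the Atom and Media RSS namespace URIs are the standard ones.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# One set of XPath expressions per feed flavour (illustrative, not exhaustive).
my %queries = (
    rss => {
        item  => '/rss/channel/item',
        title => 'title',
        link  => 'link',
        media => 'media:content/@url',
    },
    atom => {
        item  => '/atom:feed/atom:entry',
        title => 'atom:title',
        link  => 'atom:link/@href',
        media => 'media:group/media:content/@url',
    },
);

# Hypothetical helper: return a list of hashrefs, one per item or entry.
sub extract_items {
    my ($xml, $type) = @_;
    my $q = $queries{$type} or die "No queries for feed type '$type'\n";

    my $doc = XML::LibXML->load_xml(string => $xml);
    my $xpc = XML::LibXML::XPathContext->new($doc);
    $xpc->registerNs(atom  => 'http://www.w3.org/2005/Atom');
    $xpc->registerNs(media => 'http://search.yahoo.com/mrss/');

    my @items;
    for my $node ($xpc->findnodes($q->{item})) {
        my %row;
        # Each field query is evaluated relative to the current item node.
        $row{$_} = $xpc->findvalue($q->{$_}, $node) for qw(title link media);
        push @items, \%row;
    }
    return @items;
}

my $rss = <<'XML';
<?xml version="1.0"?>
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <item>
      <title>Clip one</title>
      <link>http://example.com/1</link>
      <media:content url="http://example.com/1.mp4"/>
    </item>
  </channel>
</rss>
XML

print "$_->{title}: $_->{media}\n" for extract_items($rss, 'rss');
```

Supporting another format then means adding one more entry to the table rather than writing a new parser.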