shekarkcb has asked for the wisdom of the Perl Monks concerning the following question:

Greetings,

I am planning to write a general-purpose script that takes an RSS feed URL, extracts the tags from the feed, and prints the results.
I am mainly focusing on online video sites such as ABC, CNN, etc. Since different sites use different tags,
is there any way I can handle them all together? Or is there a CPAN module already written for this?
(I can use WWW::Mechanize to fetch the content, or XML::FeedPP or XML::RSS::Parser for parsing, but they offer only a limited number of methods for extracting the tags.)
There is also the case where you are not sure which tags are used inside the XML.
Can anybody suggest how I should proceed?

Thanks,
ShekarKCB

Replies are listed 'Best First'.
Re: Parsing RSS Feeds
by moritz (Cardinal) on Mar 23, 2010 at 06:45 UTC
    There are two kinds of "tags": first, there are XML tags like <item>; second, people call keywords associated with other media "tags", although the word "label" would be more descriptive.

    Which one do you mean?

    RSS standardizes which XML tags to use, so if you use an RSS parser, you don't have to worry about different XML tags on different sites.
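To illustrate that point, here is a minimal sketch using XML::FeedPP (one of the modules mentioned in the question). The sample feed, the example.com URLs, and the helper name feed_items are made up for illustration. The media:keywords element is only an assumption about what a video feed might carry, so it is read with the generic get() accessor and may well be absent in a real feed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::FeedPP;    # CPAN module, not part of core Perl

# Made-up sample feed so the sketch runs offline. A real feed URL
# could be passed to XML::FeedPP->new() instead of this string.
my $sample_rss = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <title>Example Video Feed</title>
    <link>http://example.com/</link>
    <description>Hypothetical feed</description>
    <item>
      <title>First clip</title>
      <link>http://example.com/video/1</link>
      <category>news</category>
      <category>politics</category>
      <media:keywords>election, debate</media:keywords>
    </item>
    <item>
      <title>Second clip</title>
      <link>http://example.com/video/2</link>
      <category>sports</category>
    </item>
  </channel>
</rss>
XML

# Hypothetical helper: return one hash per item with the fields of interest.
sub feed_items {
    my ($source) = @_;
    my $feed = XML::FeedPP->new( $source, -type => 'string' );
    my @rows;
    for my $item ( $feed->get_item() ) {
        # Standard RSS elements have dedicated accessors.
        my $title = $item->title();
        my $link  = $item->link();

        # category() may hand back a single value or an array reference,
        # so normalise it to a list.
        my $cat  = $item->category();
        my @cats = ref($cat) eq 'ARRAY' ? @$cat : defined $cat ? ($cat) : ();

        # Extension elements are fetched by name; undef if not present.
        my $keywords = $item->get('media:keywords');

        push @rows, {
            title    => $title,
            link     => $link,
            labels   => \@cats,
            keywords => $keywords,
        };
    }
    return @rows;
}

for my $row ( feed_items($sample_rss) ) {
    print "$row->{title} <$row->{link}>\n";
    print "  labels:   @{ $row->{labels} }\n";
    print "  keywords: $row->{keywords}\n" if defined $row->{keywords};
}
```

The standard elements (title, link, category) come out the same way for every site. Only the extension elements need per-site handling.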

    If different sites use different labels, there is no general mapping from one to the other, but maybe in some specific case you can find rules for it.

    "Can anybody suggest how I should proceed?"

    Start working: write a script that downloads and parses the RSS. If you run into a concrete problem, come back to us and show us example data.

Re: Parsing RSS Feeds
by Khen1950fx (Canon) on Mar 23, 2010 at 08:24 UTC
    Here's a one-liner that might give you some ideas:
    curl http://rss.cnn.com/rss/cnn_freevideo.rss | perl -ne 'm/>([^<]+)<\// && print $1 . "\n"'
    I got that one here.
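In the same spirit as the one-liner, here is a slightly longer sketch that uses only core Perl and also reports which element names it saw, which may help when you do not know in advance what a feed contains. The helper name element_texts and the sample feed are made up for illustration. It is a regex heuristic, not a real XML parser: it only picks up elements whose content is plain text or a single CDATA section, so treat it as a quick way to explore a feed rather than something to build on.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return [name, text] pairs for every element whose
# content is plain text or a single CDATA section.
sub element_texts {
    my ($xml) = @_;
    my @pairs;
    while (
        $xml =~ m{
            <([\w:.-]+) (?:\s[^>]*)? >      # opening tag, optional attributes
            \s*
            (?: <!\[CDATA\[ (.*?) \]\]>     # either a CDATA section
              | ([^<]+?) )                  # or plain text without markup
            \s*
            </\1\s*>                        # matching closing tag
        }gsx
      )
    {
        push @pairs, [ $1, defined $2 ? $2 : $3 ];
    }
    return @pairs;
}

# Made-up sample data so the sketch runs without network access.
my $sample = <<'XML';
<rss version="2.0"><channel>
<title>Example Video Feed</title>
<item>
<title><![CDATA[First clip]]></title>
<link>http://example.com/video/1</link>
</item>
</channel></rss>
XML

my %count;
for my $pair ( element_texts($sample) ) {
    my ( $name, $text ) = @$pair;
    $count{$name}++;
    print "$name: $text\n";
}
print "Element names seen: ", join( ', ', sort keys %count ), "\n";
```

Once you know which element names a site actually uses, a proper parser such as XML::FeedPP is the safer choice for the real script.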

      I am thinking of writing a single module that does all the parsing and value extraction for whatever is passed in. Can anybody suggest a good tutorial on writing a Perl module? (I am not asking for code, just a tutorial to refer to.)


      Thanks,
      Shekar