XML::RSS parse problem

btubby has asked for the wisdom of the Perl Monks concerning the following question:

I'm using XML::RSS to extract certain fields from RSS feeds, but am experiencing problems extracting all the data I can see in the source (via the rss object).

For example, the feed at the following URL:
http://blogs.news.com.au/moneystuff/index.php/xml/rss_popular/14

contains a single RSS item. I need to get at the comment data. The item source is below:

        <item>
          <title><![CDATA[Beware of financial leeches]]></title>
          <link>http://blogs.news.com.au/moneystuff/index.php/news/comments/beware_of_financial_leeches1/</link>
          <guid>http://blogs.news.com.au/moneystuff/index.php/news/comments/beware_of_financial_leeches1/#60515</guid>

          <pubDate>Sun, 13 Sep 2009 20:40:00 GMT</pubDate>
          <description><![CDATA[INSTEAD of sucking your blood, financial leeches suck your cash and can suck the life out of your relationship with them.]]></description>
          <dc:source>blogs.news.com.au/moneystuff/index.php</dc:source>
          <dc:contributor>Anthony Keane</dc:contributor>
          <category>Money</category>
          <category>News</category>

          <slash:comments>2</slash:comments>
          <ndm:comments publishedtotal="3" itemtotal="5">
            
  <ndm:comment>
    <ndm:name>The Other Martin</ndm:name>
    <ndm:email>n/a</ndm:email>
    <ndm:ip>n/a</ndm:ip>
    <ndm:url>http://blogs.news.com.au/moneystuff/index.php/news/comments/beware_of_financial_leeches1</ndm:url>

    <ndm:date>Mon, 14 Sep 2009 00:15:26 GMT</ndm:date>
    <ndm:body><![CDATA[You haven&#8217;t mentioned the Greatest of all Australian Leaches &#45; the ATO! Folowed closely ehind by the larger lesser leaches (State Governments) and smaller lesser&#8230;]]></ndm:body>
  </ndm:comment>

  <ndm:comment>
    <ndm:name>JP</ndm:name>
    <ndm:email>n/a</ndm:email>
    <ndm:ip>n/a</ndm:ip>

    <ndm:url>http://blogs.news.com.au/moneystuff/index.php/news/comments/beware_of_financial_leeches1</ndm:url>
    <ndm:date>Sun, 13 Sep 2009 23:16:21 GMT</ndm:date>
    <ndm:body><![CDATA[Tell them: We have to be careful because our resources are not the best a little like you.  If they do not get&#8230;]]></ndm:body>
  </ndm:comment>

          </ndm:comments>
        </item>

However, the following code does not work as expected (assuming $source is a scalar containing the RSS source):

use XML::RSS;
use Data::Dumper;

# $source is assumed to already hold the raw RSS text
my $rss = XML::RSS->new();
$rss->parse($source);
foreach my $item ( @{ $rss->{items} } ) {
    print Dumper($item);
}

The dumped output is garbled, e.g.:

'http://feeds.news.com.au/dtd/blogcomments/' => {
    'date' => 'Mon, 14 Sep 2009 00:15:26 GMTSun, 13 Sep 2009 23:16:21 GMT',
    'ip' => 'n/an/a',
    'name' => 'The Other MartinJP',
}

That is, the two comment blocks have been merged into one. Why is this happening? Any help is appreciated. Thanks.

20090928 Janitored by Corion: Added formatting, code tags, as per Writeup Formatting Tips


Replies are listed 'Best First'.
Re: XML::RSS parse problem by btubby (Novice) on Sep 26, 2009 at 13:44 UTC
Anyone?? This is really bugging me.. pretty please!
Re^2: XML::RSS parse problem by toolic (Bishop) on Sep 26, 2009 at 14:23 UTC
Since I have never used XML::RSS, I can only offer generic advice.
- A Super Search yields 19 threads related to XML::RSS. Perhaps one of them may give you a clue: ?node_id=3989;HIT=xml%20rss;re=N
- Look at the module's source code (for the version you have installed).
- Post a question on the Discussion forum.
- I wouldn't know an RSS if I was kicking one, but have you tried alternate CPAN modules? I wouldn't be surprised if XML::Twig could handle this format.
- Contact the module author by email.
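
To make the XML::Twig suggestion concrete, here is a minimal, untested-against-the-live-feed sketch. It assumes XML::Twig is installed and relies on its default behaviour of treating a prefixed name such as ndm:comment as a literal element name. The $source heredoc is a trimmed-down stand-in for the item quoted in the question, not the real feed, and the @comments array and its hash keys are names chosen here purely for illustration.

```perl
use strict;
use warnings;
use XML::Twig;

# Trimmed-down stand-in for the feed item quoted in the question.
my $source = <<'XML';
<?xml version="1.0"?>
<rss version="2.0" xmlns:ndm="http://feeds.news.com.au/dtd/blogcomments/">
  <channel>
    <item>
      <title>Beware of financial leeches</title>
      <ndm:comments publishedtotal="3" itemtotal="5">
        <ndm:comment>
          <ndm:name>The Other Martin</ndm:name>
          <ndm:date>Mon, 14 Sep 2009 00:15:26 GMT</ndm:date>
        </ndm:comment>
        <ndm:comment>
          <ndm:name>JP</ndm:name>
          <ndm:date>Sun, 13 Sep 2009 23:16:21 GMT</ndm:date>
        </ndm:comment>
      </ndm:comments>
    </item>
  </channel>
</rss>
XML

my $twig = XML::Twig->new;
$twig->parse($source);    # parse() accepts the document as a string

# Walk every <item>, then every <ndm:comment> inside it, and build
# one hash per comment so the fields of different comments stay apart.
my @comments;
for my $item ( $twig->root->descendants('item') ) {
    for my $comment ( $item->descendants('ndm:comment') ) {
        push @comments, {
            item => $item->first_child_text('title'),
            name => $comment->first_child_text('ndm:name'),
            date => $comment->first_child_text('ndm:date'),
        };
    }
}

# One line per comment: item title, commenter name, comment date.
printf "%s | %s | %s\n", @{$_}{qw(item name date)} for @comments;
```

Because each ndm:comment element is visited on its own, the name and date values are collected per comment rather than concatenated. The other fields (ndm:ip, ndm:url, ndm:body) could presumably be read the same way with first_child_text.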