Quicksilver has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks, I've been following some suggestions on automating an RSS feed and have begun adapting some code that Merlyn pointed me towards on his site. It should read an RSS feed and then email out anything that is new (I realise that dbmopen is old code and tie is preferred; that's next). For some reason, when a new post comes in, the whole file is emailed rather than just the new post, which is what I had hoped for. Could anybody please let me know what I've done incorrectly? (When there are no new posts, no email is sent, as expected.)
use strict;
use warnings;
use LWP::Simple;
use LWP::UserAgent;
use XML::LibXML;

my $DIR = "C:\\Webroot\\rdf";
my $from = 'me';
my $to = 'me';
my $subject = "tweeting annoying";
my $ticket = qw(localhost/rss);
my @news = ("localhost/rss", => "TestList",);

chdir $DIR or die "Cannot chdir $DIR: $!";

my @output;
while (@news >= 2) {
    my ($url, $localname) = splice @news, 0, 2;
    dbmopen my %SAW, $localname, 0644
        or warn "Cannot open %SAW for $localname: $!";
    my $feed = get($ticket);
    my $parser = XML::LibXML->new;
    my $doc = $parser->parse_string($feed);
    my %seen;
    my @item_output;
    for my $item($doc) {
        my $date = $doc->findvalue('rss/channel/item/pubDate');
        my $desc = $doc->findvalue('rss/channel/item/description');
        $seen{$date} = localtime;
        next if $SAW{$date};
        push @item_output, $desc;
    }
    %SAW = %seen;
    if (@item_output) {
        push @output, @item_output;
    }
}

if (@output) {
    require Net::SMTP;
    my $smtp = Net::SMTP->new(Host => 'mailhost');
    $smtp->mail( $from );
    $smtp->to( $to );
    $smtp->data();
    $smtp->datasend("To: $to\n");
    $smtp->datasend("From: $from\n");
    $smtp->datasend("Subject: $subject\n");
    $smtp->datasend("\n"); # done with header
    $smtp->datasend("@output\n");
    $smtp->dataend();
    $smtp->quit(); # all done. message sent.
}
All I'm trying to get is the description as it contains all the necessary information I need. MTIA for any help.
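On the dbmopen point raised in the question: a minimal sketch of what the tie version might look like, assuming the core SDBM_File module is acceptable as the backing store. The temporary directory and the file name here are hypothetical stand-ins for the script's own $DIR and $localname.

```perl
use strict;
use warnings;
use Fcntl;                      # supplies O_RDWR and O_CREAT
use SDBM_File;
use File::Temp qw(tempdir);

# Hypothetical location, used only so the sketch cleans up after itself.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/TestList";
my $key  = 'Wed, 06 Aug 2008 10:00:00 GMT';   # made-up pubDate

# Roughly what "dbmopen my %SAW, $localname, 0644" does, spelled out with tie.
tie my %SAW, 'SDBM_File', $file, O_RDWR | O_CREAT, 0644
    or die "Cannot tie $file: $!";
$SAW{$key} = localtime;         # record when this item was seen
untie %SAW;

# Tie the same file again to confirm the key was written to disk.
tie my %again, 'SDBM_File', $file, O_RDWR | O_CREAT, 0644
    or die "Cannot tie $file: $!";
my $remembered = exists $again{$key} ? 1 : 0;
untie %again;

print "remembered: $remembered\n";
```

Untying before the script exits matters most on Windows, where open files cannot be removed.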

Replies are listed 'Best First'.
Re: Automating an RSS feed
by davorg (Chancellor) on Aug 06, 2008 at 17:09 UTC
    for my $item($doc) {

    This almost certainly isn't doing what you think it is. $doc isn't a list, it's a reference to an XML::LibXML document. So effectively, all you're doing there is aliasing $item to $doc. Of course, you completely ignore $item within the block, so it's not doing any actual damage. But your loop only ever runs once.
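The single-pass behaviour can be seen without any XML at all. A small sketch, in which the hash reference and the three-element list are hypothetical stand-ins for the parsed document and for a list of item nodes:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the parsed document: any single scalar behaves the same.
my $doc = { name => 'document' };

my $single_passes = 0;
for my $item ($doc) {
    # $item is only another name for $doc here
    $single_passes++;
}

# Hypothetical stand-in for a list of item nodes.
my @items = ('first', 'second', 'third');

my $list_passes = 0;
for my $item (@items) {
    $list_passes++;
}

print "single scalar: $single_passes pass(es)\n";
print "list of three: $list_passes pass(es)\n";
```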

    I think you actually want something more like this (but please note this is untested):

    for my $item ($doc->findnodes('//channel/item')) {
        my $date = $item->findvalue('pubDate');
        my $desc = $item->findvalue('description');
        $seen{$date} = localtime;
        next if $SAW{$date};
        push @item_output, $desc;
    }
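To see why the original emails everything, it may help to compare the two styles of query on a tiny feed. This is only a sketch: the two-item feed is made up, and it relies on my understanding that findvalue, when its path matches several nodes, returns their text run together.

```perl
use strict;
use warnings;
use XML::LibXML;

# Made-up two-item feed, used only for illustration.
my $xml = <<'RSS';
<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Test feed</title>
    <item>
      <pubDate>Wed, 06 Aug 2008 10:00:00 GMT</pubDate>
      <description>First post</description>
    </item>
    <item>
      <pubDate>Wed, 06 Aug 2008 11:00:00 GMT</pubDate>
      <description>Second post</description>
    </item>
  </channel>
</rss>
RSS

my $doc = XML::LibXML->new->parse_string($xml);

# Asked of the whole document, the path matches every item at once.
my $all = $doc->findvalue('rss/channel/item/description');

# Asked of each item node, the relative path matches only that item.
my @each;
for my $item ($doc->findnodes('//channel/item')) {
    push @each, $item->findvalue('description');
}

print "whole document: $all\n";
print "per item: $_\n" for @each;
```

With the per-item form, each pubDate also becomes its own key in %seen, so the check against %SAW can skip items individually.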

    But it's also worth pointing out that if you're dealing with RSS documents, then XML::RSS is far easier than using a generic XML tool like XML::LibXML.
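For comparison, a rough sketch of the same item loop using XML::RSS. The feed is made up, and the field names assume an RSS 2.0 feed whose items carry pubDate and description.

```perl
use strict;
use warnings;
use XML::RSS;

# Made-up two-item feed, used only for illustration.
my $xml = <<'RSS';
<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Test feed</title>
    <link>http://localhost/</link>
    <description>Example channel</description>
    <item>
      <title>One</title>
      <pubDate>Wed, 06 Aug 2008 10:00:00 GMT</pubDate>
      <description>First post</description>
    </item>
    <item>
      <title>Two</title>
      <pubDate>Wed, 06 Aug 2008 11:00:00 GMT</pubDate>
      <description>Second post</description>
    </item>
  </channel>
</rss>
RSS

my $rss = XML::RSS->new;
$rss->parse($xml);

# Each item is a plain hash reference keyed by element name.
my @descriptions;
for my $item (@{ $rss->{items} }) {
    my $date = defined $item->{pubDate} ? $item->{pubDate} : '(no date)';
    my $desc = defined $item->{description} ? $item->{description} : '';
    push @descriptions, $desc;
    print "$date: $desc\n";
}
```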


      Thanks. I'll give this a go later. As you say XML::RSS is pretty much the way to go but I was looking at a server which didn't have it installed and wanted to see if I could get this working without it as an experiment.
Re: Automating an RSS feed
by Your Mother (Archbishop) on Aug 06, 2008 at 23:52 UTC

    Here is something a bit more modern. See XML::Feed, DateTime, MIME::Lite, and friends.

    use strict;
    use warnings;
    use XML::Feed;
    use MIME::Lite;
    use DateTime;
    use URI;

    my $uri = URI->new("http://news.google.com/news?q=perl&output=atom");

    # parse can take raw xml, file names, and more
    my $feed = XML::Feed->parse($uri)
        or die XML::Feed->errstr;

    my $last_48 = DateTime
        ->now( time_zone => 'floating' )
        ->subtract( hours => 48 );

    exit unless 1 == DateTime->compare( $feed->modified, $last_48 );

    my $body = "";
    for my $entry ( $feed->entries ) {
        next unless 1 == DateTime->compare( $entry->modified, $last_48 );
        $body .= "<h3>" . $entry->title . "</h3>\n";
        $body .= "<div>" . $entry->content->body . "</div>\n";
    }

    warn "No body... loves me!" unless $body;

    my $msg = MIME::Lite
        ->new( From    => 'me@myhost.com',
               To      => 'you@yourhost.com',
               Subject => $feed->title,
               Type    => "text/html",
               Data    => $body,
             );

    print $msg->as_string, "\n";
    # $msg->send;
        Many thanks Davorg and Your Mother. As per above, I'm just experimenting with an older box where it's always fun to add new modules. That, and I wanted to see what goes on inside out of curiosity. I'll certainly be exploring the newer code for any production purpose.