in reply to RSS Parsing not working on new machine

The relevant code is

...
my $rss = $parser->parse_string($feed);
if(!$rss) {
    print " -- Feed is broken -- \n";
    next FEED;
}

The next step would be to look at what is in $feed, and to strip away the rest of the script until the problem can be reproduced in isolation.

In your code you already save the content to a file, so maybe you can reduce your code to something like:

my $feed = read_file( 'saved.rss' );   # read_file e.g. from File::Slurp
my $rss = $parser->parse_string($feed);
if(!$rss) {
    print " -- Feed is broken -- \n";
    next FEED;
}

... but maybe the feed you retrieve is already empty?
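For example, something along these lines (a quick sketch, assuming $feed holds whatever your download step produced) would show at a glance whether there is anything to parse at all:

# sanity-check $feed before handing it to the parser
my $len = defined $feed ? length $feed : 0;
if( !$len ) {
    print " -- Feed is empty or undefined -- \n";
    next FEED;
}
printf "got %d bytes, starting with: %.80s\n", $len, $feed;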

Re^2: RSS Parsing not working on new machine
by wintermute115 (Acolyte) on Apr 15, 2025 at 10:19 UTC
    As I say, the identical code looking at identical feeds works fine on another machine, so I don't think it's a problem with the feeds themselves. A minimal version of:
    #!/usr/bin/perl
    use strict;
    use XML::RSS::Parser;

    my $feed = "/home/ross/Downloads/New_Podcasts/archive/GMNV.rss";
    my $parser = XML::RSS::Parser->new();
    my $rss = $parser->parse_file($feed);
    print $rss . "\n";
    gives me:
    $ ./test.pl
    XML::RSS::Parser::Feed=HASH(0x5ba514d23270)

    which is what I'd expect to see, rather than the undef I get from the actual script. Yes, that file is one saved by this script.

    Changing it to point at the archived RSS rather than the version stored in memory doesn't fix the problem, though. Something is happening somewhere else that is stopping the parser from reading this file, and I can't see what it might be.
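    One thing that might help narrow it down (a rough sketch only; $feed_path here is just a stand-in for whatever variable the full script actually hands to the parser) is to log exactly what the script is trying to parse at the moment it fails:

    my $feed_path = "/home/ross/Downloads/New_Podcasts/archive/GMNV.rss";   # stand-in for the real variable
    print STDERR "about to parse '$feed_path' (",
        ( -e $feed_path ? -s $feed_path : 'missing' ), " bytes)\n";
    my $rss = $parser->parse_file($feed_path);
    print STDERR "parse_file returned: ", ( defined $rss ? $rss : 'undef' ), "\n";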

      If reading the file behaves strangely, maybe it is an issue of file/directory permissions?

      Can you check whether the user your cron job runs as can access all directories in the path, starting from /?

      Also, does the minimal script still work when run from cron?

        I haven't yet gotten to the point of running it from cron; so far I'm still testing it on the command line, but it's successfully writing these files, so permissions don't seem to be a problem.

        I have the minimal test reading successfully from cURL, and doing everything that the main script is doing to the Parser object.

        #!/usr/bin/perl
        use strict;
        use Data::Dumper;
        use Net::Curl::Easy qw(:constants);
        use XML::RSS::Parser;

        my $ua_string = "User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:30.0) Gecko/20100101 Firefox/30.0";
        my $curl = Net::Curl::Easy->new;
        my $feedsrc = "https://feeds.simplecast.com/o4MKcfjK";
        my $feed;

        $curl->pushopt(CURLOPT_HTTPHEADER, [$ua_string]);
        $curl->setopt(CURLOPT_NOPROGRESS, 0);
        $curl->setopt(CURLOPT_FOLLOWLOCATION, 1);
        $curl->setopt(CURLOPT_CONNECT_ONLY, 0);
        $curl->setopt(CURLOPT_URL, $feedsrc);
        $curl->setopt(CURLOPT_WRITEDATA, \$feed);
        $curl->perform();

        my $parser = XML::RSS::Parser->new();
        $parser->register_ns_prefix('lc_itunes', 'http://www.itunes.com/dtds/podcast-1.0.dtd');
        my $rss = $parser->parse_string($feed);
        print $rss . "\n";
        Which results in:
        $ ./test.pl
          % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                         Dload  Upload   Total   Spent    Left  Speed
        100  164k  100  164k    0     0  1267k      0 --:--:-- --:--:-- --:--:-- 1271k
        XML::RSS::Parser::Feed=HASH(0x618af3317380)

        So that's working. I just can't see why it's breaking as part of the actual script, when it's clearly readable in isolation, and works fine on a different machine.
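        A rough diagnostic to try next, right before the failing parse_string call in the full script (just a sketch; whether anything ever lands in $@ depends on how XML::RSS::Parser reports failures internally, so treat it as exploratory):

        # dump what the real script is actually passing, and trap any exception
        printf STDERR "feed is %s, %d bytes, starts: %.60s\n",
            ( defined $feed ? 'defined' : 'undef' ),
            ( defined $feed ? length $feed : 0 ),
            ( defined $feed ? $feed : '' );
        my $rss = eval { $parser->parse_string($feed) };
        print STDERR "parser exception: $@" if $@;
        print STDERR "parse_string returned: ", ( defined $rss ? $rss : 'undef' ), "\n";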