morgon has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

All of a sudden I have a strange problem with WWW::Mechanize that I've boiled down to this:

use WWW::Mechanize ();

my $mech = WWW::Mechanize->new;
$mech->agent_alias('Linux Mozilla');

my $url = "http://www.economist.com/news/world-week/21716670-politics-week/print";
$mech->get( $url, ":content_file" => "output.html" );
What happens is that the output file contains only part of the expected content: a fragment of an HTML document that looks as if it was truncated somewhere.

wget has no problem downloading the URL correctly.

What could be the issue here?

Many thanks!

Replies are listed 'Best First'.
Re: problem with WWW::Mech
by LanX (Saint) on Feb 11, 2017 at 21:02 UTC
    Are you aware that The Economist wants to be paid after the 3rd article?

    Otherwise only the first 2 or 3 paragraphs are shown.

    Cheers Rolf
    (addicted to the Perl Programming Language and ☆☆☆☆ :)
    Je suis Charlie!

Re: problem with WWW::Mech
by morgon (Priest) on Feb 11, 2017 at 21:21 UTC
    When I dump the response object returned from the call to "get", I can see this line:
    'x-died' => 'Illegal field name \'X-Meta-Article:publisher\' at /home/mh/perl5/perlbrew/perls/perl-5.16.2/lib/site_perl/5.16.2/x86_64-linux/HTML/HeadParser.pm line 207.',
    I think the X-Died header is inserted when a die occurs somewhere during response processing, which would explain why I don't see the full content.
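    For illustration, a minimal sketch of checking for that header after a fetch (the URL here is a placeholder; any page triggering a post-processing die would show the same behavior):

    ```perl
    use strict;
    use warnings;
    use WWW::Mechanize ();

    my $mech = WWW::Mechanize->new;
    $mech->get("http://www.example.com/");

    # LWP stores an exception thrown while post-processing the response
    # (e.g. inside HTML::HeadParser) in the X-Died header rather than
    # propagating it, so the get() itself can appear to succeed even
    # though the content was only partially processed.
    if ( my $err = $mech->response->header('X-Died') ) {
        warn "response was only partially processed: $err\n";
    }
    ```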

    Is there a way to hack around this?
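    One possible workaround, besides upgrading the module, is to pass the parse_head option through to the underlying LWP::UserAgent so that HTML::HeadParser never runs on the response. A sketch (the trade-off is that headers derived from <meta> tags, and <base>-aware link resolution, are then unavailable):

    ```perl
    use strict;
    use warnings;
    use WWW::Mechanize ();

    # parse_head => 0 is passed through to LWP::UserAgent and disables
    # the HTML::HeadParser pass over each response, so a die inside the
    # head parser can no longer truncate the downloaded content.
    my $mech = WWW::Mechanize->new( parse_head => 0 );
    $mech->agent_alias('Linux Mozilla');

    my $url = "http://www.economist.com/news/world-week/21716670-politics-week/print";
    $mech->get( $url, ":content_file" => "output.html" );
    ```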

      Ok.

      I've updated HTML::HeadParser and all is fine again.