WWW::Mechanize get doesn't work

zingbust has asked for the wisdom of the Perl Monks concerning the following question:

Hi. I was simply using the get method of Mechanize the way it is usually suggested, like this:

        $m->get($link);

where $link is a WWW::Mechanize::Link object. I found that occasionally it would hang if the URL was invalid, and my program would stop working. Because of this, it was suggested that I change my code so that it now reads like this:

        my $connected = eval {
          $m->get($link);
          1
        };
        if (! $connected) {
           #do something else
        }

This seemed to do the trick. However, after fetching a few hundred valid URLs, even this construction suddenly failed to move on when it reached a perfectly valid link:

http://www.aaaedm.com/contact.htm

This link was found by a previous fetch using the get method. If you go to that URL in IE, nothing unusual happens, yet my $m->get hangs the program at that point, even though there is nothing wrong with the URL.

Can anyone tell me why $m->get fails for this particular web address after working correctly on hundreds of other URLs? Since I have about 10,000 more to go through, I assume the same problem will happen again. If I knew why it was happening, I could perhaps come up with a solution. Thanks!


Replies are listed 'Best First'.
Re: WWW::Mechanize get doesn't work by GrandFather (Saint) on Mar 22, 2012 at 00:01 UTC
The code change won't make any difference to a "hang", but will hide errors that are reported by the calling code dying. Generally hiding such errors is a bad thing, so I wouldn't do that. The following trivial code works fine fetching your problematic page:

        use strict;
        use warnings;
        use LWP::Simple;

        my $page = get('http://www.aaaedm.com/contact.htm');
        print $page;

Maybe there are warnings or errors earlier in your code execution that are causing the problem you see at the point in the code you describe. Can you whittle your current script down to a half dozen or so lines that show the issue for the URL you give?

True laziness is hard work
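Since a plain eval only catches a die and does nothing about a call that never returns, one generic way to bound a hang is to pair eval with alarm. The following is a minimal sketch of that idiom and not something proposed in this thread: run_with_timeout is a hypothetical helper, the select call merely stands in for a request that never comes back, and it assumes a platform where alarm interrupts blocking calls (typically Unix-like systems). WWW::Mechanize is a subclass of LWP::UserAgent, which has its own timeout attribute, so that is worth checking first.

```perl
use strict;
use warnings;

# Hypothetical helper: run a code ref, but give up after $seconds.
# Returns (1, undef) on success, or (0, $error) on failure or timeout.
sub run_with_timeout {
    my ($seconds, $code) = @_;
    my $ok = eval {
        # The handler turns the alarm signal into an ordinary exception.
        local $SIG{ALRM} = sub { die "timed out after ${seconds}s\n" };
        alarm $seconds;
        $code->();
        alarm 0;    # finished in time, so cancel the pending alarm
        1;
    };
    my $err = $@;
    alarm 0;        # also cancel it if $code died for another reason
    return $ok ? (1, undef) : (0, $err);
}

# A five-second wait stands in for a request that never returns.
my ($ok, $err) = run_with_timeout(1, sub { select(undef, undef, undef, 5) });
print $ok ? "finished\n" : "gave up: $err";
```

With Mechanize, the code ref would presumably be something like sub { $m->get($link) }, with the failing URL logged before moving on to the next one.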
Re^2: WWW::Mechanize get doesn't work by Anonymous Monk on Mar 22, 2012 at 03:16 UTC
"...but will hide errors that are reported by the calling code dying."

No, it won't hide the errors; they are available in $@ (see perlvar) if you look.
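To make that point concrete, here is a small sketch of the eval pattern from the original question extended to report $@ instead of discarding it. The fetch sub is a hypothetical stand-in for $m->get($link) so that the example runs without a network connection, and try_fetch is likewise an illustrative name.

```perl
use strict;
use warnings;

# Hypothetical stand-in for $m->get($link): dies on a malformed URL.
sub fetch {
    my ($url) = @_;
    die "cannot fetch '$url': unsupported scheme\n"
        unless $url =~ m{^https?://};
    return "content of $url";
}

# Returns the error message for a failed fetch, or undef on success.
sub try_fetch {
    my ($url) = @_;
    my $ok = eval { fetch($url); 1 };
    return $ok ? undef : ($@ || 'unknown error');
}

for my $url ('http://www.aaaedm.com/contact.htm', 'not a url') {
    my $err = try_fetch($url);
    print defined $err ? "FAILED $url: $err" : "ok $url\n";
}
```

Logging the message this way keeps the loop going over the remaining URLs while still recording why a particular one failed.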
Re^3: WWW::Mechanize get doesn't work by GrandFather (Saint) on Mar 22, 2012 at 03:29 UTC
Or in this case, if you know to look. The OP's comments smack somewhat of cargo culting, so suggesting that the code he adopted is potentially "hiding" errors is pertinent.
Re^4: WWW::Mechanize get doesn't work by Anonymous Monk on Mar 22, 2012 at 06:33 UTC
Re: WWW::Mechanize get doesn't work by choroba (Cardinal) on Mar 21, 2012 at 22:27 UTC
Strange. This works for me:

        use WWW::Mechanize;

        $mech = WWW::Mechanize->new();
        $mech->get("http://www.aaaedm.com/contact.htm");
        print $mech->find_link(url_regex => qr/^mailto:/)->url, "\n";

Prints:

        mailto:info@aaaedm.com