friedo has asked for the wisdom of the Perl Monks concerning the following question:

Hi, everyone. I am having some trouble tracking down some odd behaviour with WWW::Mechanize. I am testing a web application. Among other things, I need to generate diffs between a CGI and a newer mod_perl version of the application, to ensure that they behave the same. I am attempting to follow a list of links with the same characteristics from each version of the application, but I am running into an odd problem. I am using code like the following.

    use strict;
    use WWW::Mechanize;

    my $mech_cgi = WWW::Mechanize->new;
    my $mech_mod = WWW::Mechanize->new;

    $mech_cgi->get( 'http://www.foo.com/cgi/' );
    $mech_mod->get( 'http://www.foo.com/modperl/' );

    my @cgi_links = $mech_cgi->find_all_links( text_regex => qr/Example/ );
    my @mod_links = $mech_mod->find_all_links( text_regex => qr/Example/ );

    # test the first ten
    for( 0..9 ) {
        print "following link: ", $cgi_links[$_]->url, "\n";
        $mech_cgi->follow_link( url => $cgi_links[$_]->url )
            or die "Error following link ", $cgi_links[$_]->url;

        print "following link: ", $mod_links[$_]->url, "\n";
        $mech_mod->follow_link( url => $mod_links[$_]->url )
            or die "Error following link ", $mod_links[$_]->url;

        # do some stuff

        print "finished link $_\n";
    }

Everything seems to go fine up to following the first link in the loop. (#0.) After that, when I attempt to follow the second link, my script dies with the error. Unfortunately I haven't yet figured out how to get a more useful error message from WWW::Mechanize. (I'm sure I'm missing something in the docs.)

Here is the output I get:

    [mike@localhost ~]$ perl webdiff.pl
    following link /cgi-bin/m.cgi?mid=261847852
    following link /perl/m?mid=261847852
    finished link 0
    Error following link /cgi-bin/m.cgi?mid=305436842 at webdiff.pl line 91.

I appreciate any help. Thanks.

Update: Formatting error.

Re: WWW::Mechanize and following multiple links
by Limbic~Region (Chancellor) on Nov 24, 2004 at 15:24 UTC
    friedo,
    I am assuming you want to use $mech_cgi->back() and $mech_mod->back() after testing each link. See this for an example where I did something similar.

    Cheers - L~R
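
For readers following along, here is a minimal sketch of the back() approach. It rests on two assumptions: every URL passed in appears as a link on the page the object is sitting on when the helper is called, and autocheck is disabled, so that a missing link makes follow_link return undef instead of throwing. The helper name follow_and_return is made up for illustration; follow_link, back, and uri are WWW::Mechanize methods. The point is that follow_link only searches the current page, so after the first follow the object is on the target page and the next listing link is no longer there to be found.

```perl
use strict;
use warnings;

# Hypothetical helper: follow each URL from the current page, then step
# back so the next link can still be found on the original page.
# $mech is expected to be a WWW::Mechanize object (or anything that
# offers follow_link, back and uri).
sub follow_and_return {
    my ( $mech, @urls ) = @_;
    my @visited;

    for my $url (@urls) {
        # follow_link looks for the link on the page we are on right now.
        $mech->follow_link( url => $url )
            or die "No link to $url on " . $mech->uri . "\n";

        push @visited, $mech->uri;    # do the real comparison work here

        $mech->back;                  # return to the listing page
    }

    return @visited;
}

# Usage, assuming $mech_cgi and @cgi_links as in the original post:
#   follow_and_return( $mech_cgi, map { $_->url } @cgi_links[ 0 .. 9 ] );
```

Including the current URI in the die message is one cheap way to see which page the object was on when the lookup failed.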

      Thanks, Limbic! That appears to be exactly what I needed. Mechanize is now happily following all the links. Who knew you had to press the back button to go back? :)

      Isn't it amazing how the most baffling problems always seem to have the most obvious solutions?

Re: WWW::Mechanize and following multiple links ( use WWW::Mechanize::Clones)
by diotalevi (Canon) on Nov 24, 2004 at 17:17 UTC

    I would tend to propose that you extend WWW::Mechanize to support having multiple browser sessions.

        package WWW::Mechanize::Clones;
        use strict;
        use warnings;
        use Storable 'dclone';

        sub WWW::Mechanize::clone { dclone $_[0] }

        1;

        __END__

        =head1 NAME

        WWW::Mechanize::Clones - "fork" browser windows

        =head1 DESCRIPTION

        This allows you to "fork" a WWW::Mechanize browser so you can follow
        multiple links.

        =head1 SYNOPSIS

            use WWW::Mechanize;
            use WWW::Mechanize::Clones;

            my $browser = WWW::Mechanize->new;
            $browser->get( ... );
            my @browsers = map( {
                my $new_browser = $browser->clone;
                $new_browser->follow_link( url => $_->url );
                $new_browser;
            } $browser->find_all_links( ... ) );

        =head1 ADDED METHODS

        =over 4

        =item $browser->clone

        This additional method is added to the WWW::Mechanize class so that
        you can make a copy of your browser at a point in time. This allows
        you to follow multiple links from the same page without having to
        backtrack.

        Here is an example of how to do this without this module:

            $browser->follow_link( ... );
            # do some work
            $browser->back;
            $browser->follow_link( ... );

        And here is how you can now do this. You no longer have to keep track
        of how many times to use ->back to get back to where you can follow
        the other link.

            my $new_browser = $browser->clone;
            $browser->follow_link( ... );
            $new_browser->follow_link( ... );

        =back

        =cut
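
To illustrate the cloning idea without touching the network, here is a small sketch built on a toy class. ToyBrowser and its methods are hypothetical stand-ins, not part of WWW::Mechanize; the only real API used is Storable's dclone, which deep-copies a blessed structure. Whether a real WWW::Mechanize object survives dclone intact is not something this sketch demonstrates.

```perl
use strict;
use warnings;
use Storable ();

# ToyBrowser is a hypothetical stand-in for a browser object: it only
# tracks the current page and a history stack.
package ToyBrowser;

sub new {
    my ( $class, $page ) = @_;
    return bless { page => $page, history => [] }, $class;
}

sub page { return $_[0]{page} }

# Deep copy of the whole object, history included; blessing is preserved.
sub clone { return Storable::dclone( $_[0] ) }

sub follow {
    my ( $self, $target ) = @_;
    push @{ $self->{history} }, $self->{page};
    $self->{page} = $target;
    return $self;
}

package main;

my $listing = ToyBrowser->new('listing');

# Each clone follows its own link; the original never leaves the listing.
my @tabs = map { $listing->clone->follow($_) } qw( item1 item2 );

print join( ', ', map { $_->page } @tabs ), "\n";    # item1, item2
print $listing->page, "\n";                          # listing
```

Because each clone carries its own copy of the history, none of them needs a matching back call to leave the original where it was.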
Re: WWW::Mechanize and following multiple links
by petdance (Parson) on Nov 24, 2004 at 21:12 UTC
    This is unrelated to Mech, but it makes me terribly nervous to see people using $_ for loops of more than one line. You're playing with a global variable, and I sure hope nothing stomps on it. It's trivial to change that to
        for my $i ( 0..9 ) {
            # blah blah
            $link[$i]->whatever;
        }
    and now you're not going to have something somewhere else modify $_ when you're not expecting it.

    xoxo,
    Andy

      I might get nervous, but not extremely nervous. Subs (and methods) shouldn't clobber $_ unless they're documented to, and with a purpose. For every line in your block it should be clear what exactly it does to $_.

      In such a case, multiline blocks using $_ are fine.

      p.s. Your anxiety needn't be restricted to just for loops. You seem to be wary of anything messing with $_, anywhere. You probably never use $_ over several lines.

      petdance,
      I tend to pay close attention to people who have had a fair amount of experience. When I first read what you wrote, I got a little nervous because I often use foreach loops without specifying a looping variable. I thought about it a bit, and just about everything in Perl that sets $_ localizes it first, like map, grep, foreach loops, etc. (while loops excluded). I didn't do an exhaustive search (checked the CB), but the list of things I found that modify $_ was quite short:
      • chomp
      • chop
      • s///
      Since $_ is aliased, I am intentionally changing $_ if I use one of those. Am I completely missing the boat here? I am not saying there is no case where this can bite you, but is it more common than I think, or are you just protecting against Murphy's Law?

      Cheers - L~R
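
A short sketch of the aliasing mentioned above, using made-up data. Inside a foreach loop $_ is an alias for the array element, so chomp and s/// change the array itself, while map with a plain expression builds a new list and leaves the source alone.

```perl
use strict;
use warnings;

my @words = ( "apple\n", "banana\n" );

for (@words) {
    chomp;      # strips the newline from the array element itself
    s/a/A/;     # first "a" becomes "A", again in the element
}

print join( ',', @words ), "\n";    # Apple,bAnana

# map only reads $_ here, so @words is left as it was.
my @shouted = map { uc($_) } @words;

print join( ',', @shouted ), "\n";  # APPLE,BANANA
print join( ',', @words ), "\n";    # Apple,bAnana
```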

        Those while loops that you are excluding are the primary danger here; the problem scenario usually manifests something like:
            sub foo {
                while (<$fh>) { ... }
            }

            for (@important) {
                ...
                foo();
                ...
            }
        and presto, elements in @important are replaced by lines from $fh.
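
The scenario above can be reproduced with an in-memory filehandle. This is only a sketch: the sub names and data are invented for the demonstration. It shows the clobbering, then two ways around it: a lexical variable in the reader, or a named loop variable in the caller.

```perl
use strict;
use warnings;

my $text = "line one\nline two\n";

# A bare while (<$fh>) assigns each line to the global $_ without
# localizing it, so whatever $_ is aliased to gets overwritten.
sub count_lines_unsafe {
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    my $count = 0;
    while (<$fh>) { $count++ }
    return $count;
}

# Same work with a lexical variable; $_ is never touched.
sub count_lines_safe {
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    my $count = 0;
    while ( my $line = <$fh> ) { $count++ }
    return $count;
}

my @unsafe  = qw( alpha beta );
my @safe    = qw( alpha beta );
my @lexical = qw( alpha beta );

for (@unsafe) { count_lines_unsafe() }           # $_ aliases each element
for (@safe)   { count_lines_safe() }             # reader leaves $_ alone
for my $item (@lexical) { count_lines_unsafe() } # elements are not aliased to $_

print "unsafe:  ", join( ' ', map { defined $_ ? $_ : 'undef' } @unsafe ), "\n";
print "safe:    @safe\n";
print "lexical: @lexical\n";
```

In this run the clobbered elements end up undefined rather than holding a line, because the final read at end of file also assigns its undef to $_. A reader that leaves its loop early would leave a line behind instead.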

      Andy,
      Is it ok to dclone a WWW::Mechanize object to avoid having to deal with ->back? If so, could you add that or a similarly functioning method to the distribution? That'd sure make my life easier.

        I'm not sure what exactly you're wanting to do, but submit a request with a patch, or at least sample code that you imagine it to be, to bug-www-mechanize@rt.cpan.org.

        xoxo,
        Andy