in reply to WWW::Mechanize and Navigation

Why are you mixing LWP::Simple calls and WWW::Mechanize calls? While both retrieve pages from the web, they cannot easily be mixed.

Here is a way to achieve what you want with WWW::Mechanize alone, using the agent's links() method. It returns the list of links on the current page, from which you can then select the ones you want:

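    # After you've submitted your query:
    # keep only the links whose URL points at a result list
    my @links = grep { $_->url =~ m!liste_resultats! } $agent->links;
    foreach my $link (@links) {
        print "Retrieving ", $link->url, "\n";
        $agent->get( $link->url );    # fetch the linked page
        print $agent->content;
        $agent->back;                 # return to the result page
    }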

Re^2: WWW::Mechanize and Navigation
by New Novice (Sexton) on Nov 25, 2004 at 20:23 UTC
    Thanks for this! Looks like an elegant solution!

    Unfortunately, I can't quite get it to work. Do you by any chance know where I could find more information about the links() method? CPAN does not list the find_links method.

    Here is the code I tried unsuccessfully, in case you are interested.

    #!/usr/bin/perl -w
    use strict;
    use WWW::Mechanize;

    our $count = 1;
    our $year  = 1976;
    while ($year < 1977) {
        my $input;
        my $agent = WWW::Mechanize->new();
        $agent->get("http://europa.eu.int/prelex/rech_avancee.cfm?CL=en");
        $agent->form(2);
        $agent->field("clef2", "$year");
        $agent->field("clef1", 'COM');
        $agent->field("nbr_element", '99');
        $agent->click();
        $input = $agent->content();
        my @pcplinks = grep { $_->url =~ m!liste_resultats.cfm! } $agent->links;
        print @pcplinks;
        my $filecount;
        $filecount = 0;
        foreach my $pcplink (@pcplinks) {
            my $input2;
            print "Retrieving $pcplink\n";
            $agent->follow( $pcplink );
            $filecount++;
            $input2 = $agent->content();
            $agent->back;
        }
    }

      You can read the WWW::Mechanize documentation either from your console window by typing perldoc WWW::Mechanize, or online via http://search.cpan.org. The online page documents version 1.04; you should upgrade to that version if yours is much older.
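Since the advice above depends on which version is installed, a quick check may help. The following is only a minimal sketch: module_version is a hypothetical helper name, not part of WWW::Mechanize, and it relies on nothing beyond core Perl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the installed version of a module,
# or undef if the module cannot be loaded at all.
sub module_version {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # WWW::Mechanize -> WWW/Mechanize.pm
    return undef unless eval { require $file; 1 };
    return $module->VERSION;
}

my $version = module_version('WWW::Mechanize');
print defined $version
    ? "WWW::Mechanize $version is installed\n"
    : "WWW::Mechanize is not installed\n";
```

If the reported version is well below the documented one, methods described in the online documentation may be missing or behave differently, so perldoc WWW::Mechanize on your own machine is the more reliable reference.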

        Hi again,

        from what I've gathered so far, there seems to be a problem with the dereferencing. If I print out the links it finds, they are still in the form of references (...ARRAY...), although I thought the -> operator should take care of this.

        But thanks anyway.
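One possible explanation for the ARRAY(...) output, sketched here under assumptions: grep only uses the -> expression to decide which elements to keep, and then returns those elements unchanged, so the result is still a list of references. The sample data below is hypothetical and assumes each link is a plain array reference with the URL in its first slot; no web access or WWW::Mechanize is needed to run it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data: each link is an array reference [ url, text ].
my @links = (
    [ 'liste_resultats.cfm?page=1', 'Results 1' ],
    [ 'help.cfm',                   'Help'      ],
    [ 'liste_resultats.cfm?page=2', 'Results 2' ],
);

# grep keeps the matching elements as they are, so these are still references.
my @matching = grep { $_->[0] =~ m!liste_resultats\.cfm! } @links;
print "As reference: $matching[0]\n";    # prints something like ARRAY(0x...)

# map pulls the URL string out of each matching reference.
my @urls = map { $_->[0] } @matching;
print "As URL: $_\n" for @urls;
```

In other words, the dereference has to happen again at the point where the link is printed or followed, not only inside the grep block.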

        That's apparently the problem. With the ActiveState Windows distribution, only version 0.72 works (I tried to install 1.05, but it didn't work). Thus, there are no Link objects...

        So I guess it's back to clumsy pattern-finding via regex...
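Before falling back to regexes, it may be worth trying an accessor that copes with both situations. This is a sketch only: link_url and Demo::Link are hypothetical names invented for illustration, and the array-reference layout (URL in the first slot) is an assumption based on the ARRAY(...) output described above, so it should be checked against the perldoc of the installed version.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical helper: return the URL of a link, whether it is an object
# with a url() method or a plain array reference with the URL in slot 0
# (the array layout is an assumption about older releases).
sub link_url {
    my ($link) = @_;
    return $link->url if blessed($link) && $link->can('url');
    return $link->[0] if ref($link) eq 'ARRAY';
    return $link;    # already a plain string
}

# Hypothetical stand-in for a link object, used only for this demonstration.
package Demo::Link;
sub new { my ($class, $url) = @_; return bless [ $url ], $class }
sub url { return $_[0][0] }

package main;

my @links = (
    Demo::Link->new('liste_resultats.cfm?page=1'),
    [ 'liste_resultats.cfm?page=2', 'Results 2' ],
    'help.cfm',
);

# Convert every link to its URL first, then filter on the URL string.
my @wanted = grep { m!liste_resultats\.cfm! } map { link_url($_) } @links;
print "$_\n" for @wanted;
```

With real data, the list would come from $agent->links, and each selected URL could then be passed to $agent->get(...) rather than to follow(), since get() takes a plain URL.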