<sigh> I've searched and tried all day and just can't figure this out. </sigh>

What I need is to go to a site, search for something, grab the href of every hit, then go to the next page of results and do the same until there are no more pages.

So, let's say I enter "Harry Potter" as my search term at www.mywebsite.com... The page that is returned to me has a form which includes which page I'm looking at as well as how many pages there are in total. There are 20 results per page. I need to grab each result and then go on to the next page by posting the form.

Does that make sense?

Here's what I've got. I can't even get it to go to page 2.

#!/usr/bin/perl -w
use strict;
use LWP::Simple;
use WWW::Mechanize;
use HTML::Form;

my $url = 'http://www.ncbi.nlm.nih.gov/sites/entrez?term=rnr2&cmd=Search&db=nuccore&QueryKey=17';
my $browser = WWW::Mechanize->new;
my $site = $browser->get($url);
die( "Can't get $url -- ", $site->status_line ) unless $site->is_success;

$browser->form('EntrezForm');

foreach my $item($browser->form('EntrezForm')){
    my $nextPage = "";
    my $maxPage = "";
    my $field="";
    my $fieldValue = "";
    print "\n"."-----NewPage-----"."\n";
    while( my ($k, $v) = each %$item ) {
        if ($k eq "action"){
            my $action = $v;
            print "\n\n"."ACTION: ".$action."\n";
        }
        if ($k eq "method"){
            my $method = $v;
            print "\n\n"."METHOD: ".$method."\n";
        }
        if ($k eq "attr") {
            print "\n\n"."ATTRIBUTES"."\n";
            while( my ($k, $v) = each %$v ) {
                print "key: $k, value: $v.\n";
            }
        }
        if ($k eq "inputs"){
            print "\n\n"."INPUTS"."\n";
            my @newarray = @$v;
            foreach my $thisItem(@newarray){
                while (my($key, $value) = each %$thisItem){
                    if ( (($key eq "name") && ($value eq "EntrezSystem2.PEntrez.Nuccore.Sequence_ResultsPanel.Pager.PageNumber"))||
                         (($key eq "name") && ($value eq "EntrezSystem2.PEntrez.Nuccore.Sequence_ResultsPanel.Pager.MaxPage")) ) {
                        $field = $value;
                        if ($field =~ m/PageNumber/){$nextPage=($fieldValue+1);$browser->set_fields("$field" => "$nextPage",);}
                        if ($field =~ m/MaxPage/){$maxPage=$fieldValue;}
                        print $field." => ".$fieldValue."\n";
                    }
                    if ($key eq "value"){
                        $fieldValue = $value;
                    }
                }
            }
        }
    }
    #parse HTML to get <a>links</a> of each organism hit
    #save links to file for use after this big loop
    if ($nextPage <= $maxPage) {
        $browser->submit();
        print "submit";
        $browser->content;
        $browser->form('EntrezForm');
    }
}
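
For reference, the loop described above (read the pager fields, grab the links, bump the page number, post the form again) could be sketched roughly like this. It is only a sketch and is untested against the live site: `collect_links` is a made-up helper name, the form and field names are the ones from the script above, and it assumes the site accepts a plain re-post of the form with a new page number (it may well want other fields set too). It uses the WWW::Mechanize accessors `form_name`, `value`, `field`, `submit` and `links` instead of walking the form object's internal hash.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: walk a paged result form and collect every link URL.
# $mech is expected to behave like a WWW::Mechanize object that has already
# fetched the first page of results.
sub collect_links {
    my ($mech, $form_name, $page_field, $max_field) = @_;
    my (@urls, $last_page);

    while (1) {
        # Select the pager form on the page we are currently looking at.
        $mech->form_name($form_name)
            or die "No form named '$form_name' on this page\n";

        my $page = $mech->value($page_field);
        my $max  = $mech->value($max_field);

        # Stop if the server did not actually move us forward (avoids looping forever).
        last if defined $last_page && $page <= $last_page;
        $last_page = $page;

        # Grab every href on this page; filter these down to the hits afterwards.
        push @urls, map { $_->url_abs } $mech->links;

        last if $page >= $max;    # that was the final page

        # Ask for the next page by posting the same form back.
        $mech->field($page_field, $page + 1);
        $mech->submit;
    }
    return @urls;
}

# Only touch the network when a URL is given on the command line.
if (my $url = shift @ARGV) {
    require WWW::Mechanize;
    my $prefix = 'EntrezSystem2.PEntrez.Nuccore.Sequence_ResultsPanel.Pager.';
    my $mech   = WWW::Mechanize->new;    # autocheck (on by default) dies on HTTP errors
    $mech->get($url);
    print "$_\n"
        for collect_links($mech, 'EntrezForm', $prefix . 'PageNumber', $prefix . 'MaxPage');
}
```

Run it with the search URL as the only argument. The list that comes back still contains every link on each page (navigation, help and so on), so it would need filtering down to the actual hits.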

In reply to post, return, parse, repeat by ShayShay
