wfischer has asked for the wisdom of the Perl Monks concerning the following question:

Attempting to automate a series of queries to an HIV research tool ("http://www.hiv.lanl.gov/content/sequence/HYPERMUT/hypermut.html"; within scope of acceptable use, BTW).

Although I can read the page into a WWW::Mechanize object, and define the file I need to upload, there are two buttons on the page -- I need to click the second one ("Run") and am unable to access it through find_all_submits() or find_all_inputs(): only the first shows up! E.g., via find_all_submits() in the perl debugger:

x $mechanize->find_all_submits()
0  HTML::Form::SubmitInput=HASH(0x7f9fab008238)
   '/' => '/'
   'class' => 'button'
   'name' => 'btnG'
   'onmouseout' => 'this.style.background=\'#e3e3e3\';'
   'onmouseover' => 'this.style.background=\'#f3600a\';'
   'style' => '{width: 80px; font-size: 9pt;}'
   'type' => 'submit'
   'value' => 'Search Site'
   'value_name' => ''

Nevertheless, after I save the page content to a file, mech-dump shows both the first submit button, "btnG", and the second one ("submit=Run (submit)" -- the one I need):

% mech-dump /tmp/hypermut.html
GET http://searcher-green.lanl.gov/search
  q=                             (text)
  btnG=Search Site               (submit)
  client=outside_lanl            (hidden readonly)
  ...
  site=HIV                       (hidden readonly)

POST file:///cgi-bin/HYPERMUT/hypermut.cgi (multipart/form-data)
  FORMAT=FASTA                   (option)   [IG (IntelliGenetics)|MSF|GDE|*FASTA|PHYLIP (Interleaved)|PHYLIP (Sequential)|SLX|TABLE]
  ALIGNMENT=                     (textarea)
  upfile1=                       (file)
  ...
  ...
  submit=Run                     (submit)
  <NONAME>=<UNDEF>               (reset)

$mechanize->submit() hits the first (search) button -- not what I need. I suspect the problem has something to do with the two sets of fields that mech-dump returns (GET vs. POST): what do these mean? I can set upfile1 just fine, but how can I hit the "Run" button afterwards?

Minimal working example:
#!/usr/bin/perl
use warnings;
use strict;
use WWW::Mechanize;
use File::Temp qw/tempfile/;

my $url = "http://www.hiv.lanl.gov/content/sequence/HYPERMUT/hypermut.html";
my $mechanize = WWW::Mechanize->new( autocheck => 1 );
$mechanize->get($url);

my $seqfile = make_test_seqfile();
$mechanize->field( 'upfile1', $seqfile );

my $page = $mechanize->content;

# save the page locally
open my $FH, '>', '/tmp/hypermut.html'
    or die "cannot write /tmp/hypermut.html: $!\n";
print {$FH} $page;
close $FH;
warn "saved webpage data to /tmp/hypermut.html\n";

# write a small FASTA alignment to a temp file and return its path
sub make_test_seqfile {
    # UNLINK => 0: with UNLINK => 1 the file is deleted as soon as the
    # File::Temp object goes out of scope at the end of this sub, leaving
    # the caller with the path of a file that no longer exists
    my $testfile = File::Temp->new( UNLINK => 0, SUFFIX => '.fasta' )
        or die "File::Temp: $!\n";
    warn "opened $testfile";
    print {$testfile} <<'END_TESTSEQS';
>HIV1-test.CONS
ATGGGATGTCTTGGGAATCAGCTGCTTATCGCGCTCTTGCTAGTAAGTGCTTTAGAGATTTATTGTGTTC
>HIV1-test.1
ATGGGATGTCTTGGGAATCAGCTGCTTATCGCGCTCTTGCTAGTAAGTGCTTTAGAGATTTATTGTGTTC
>HIV1-test.2
ATGGGATGTCTTGGGAATCAGCTGCTTATCGCGCTCTTGCTAGTAAGTGCTTTAGAGATTTATTGTGTTC
>HIV1-test.3
ATGGGATGTCTTGGGAATCAGCTGCTTATCGCGCTCTTGCTAGTAAGTGCTTTAGAGATTTATTGTGTTC
>HIV1-test.4
ATGGGATGTCTTGGGAATCAGCTGCTTATCGCGCTCTTGCTAGTAAGTGCTTTAGAGATTTATTGTGTTC
>HIV1-test.5
ATGGGATGTCTTGGGAATCAGCTGCTTATCGCGCTCTTGCTAGTAAGTGCTTTAGAGATTTATTGTGTTC
END_TESTSEQS
    close $testfile;
    return ( $testfile . '' );    # stringify the object to get the filename
}
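
On the GET vs. POST question: the two blocks that mech-dump prints are two separate HTML forms on the same page, and WWW::Mechanize's find_all_inputs() and find_all_submits() only look at the current form, which defaults to the first one. The sketch below shows this offline with HTML::Form (the parser WWW::Mechanize builds on). The HTML is a hypothetical, cut-down stand-in for the real page, reconstructed from the mech-dump output above, so the exact markup is an assumption.

```perl
use strict;
use warnings;
use HTML::Form;

# Hypothetical stand-in for the HYPERMUT page: a search form and an upload form.
my $html = <<'END_HTML';
<html><body>
<form method="get" action="http://searcher-green.lanl.gov/search">
  <input type="text" name="q">
  <input type="submit" name="btnG" value="Search Site">
</form>
<form method="post" action="/cgi-bin/HYPERMUT/hypermut.cgi" enctype="multipart/form-data">
  <textarea name="ALIGNMENT"></textarea>
  <input type="file" name="upfile1">
  <input type="submit" name="submit" value="Run">
</form>
</body></html>
END_HTML

# The base URL is used to resolve relative form actions.
my $base  = 'http://www.hiv.lanl.gov/content/sequence/HYPERMUT/hypermut.html';
my @forms = HTML::Form->parse( $html, $base );

# List each form with its method, action and the names of its submit buttons.
for my $i ( 0 .. $#forms ) {
    my $form    = $forms[$i];
    my @submits = map { $_->name } $form->find_input( undef, 'submit' );
    printf "form %d: %s %s submits=[%s]\n",
        $i + 1, $form->method, $form->action, join( ',', @submits );
}
```

If the live page is laid out the same way, calling $mechanize->form_number(2) before $mechanize->field('upfile1', $seqfile) should make the second form current, after which find_all_submits() would list the "Run" button and $mechanize->click('submit') would press it. That the upload form is form number 2 is inferred from the order of the mech-dump output, not verified against the site.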

Replies are listed 'Best First'.
Re: www::mechanize: second "submit" invisible to find_all_submits
by Anonymous Monk on Dec 08, 2014 at 07:28 UTC

        This excellent suggestion is a good start, but it doesn't get me there. I've altered the code to fit my current understanding, but my $mech object, even after the post, still represents the initial page, not the results page.

        I'm encouraged that I can in fact retrieve the initial page, but discouraged not to be able to get to the results (whether I upload the sequences as a file via the 'upfile1' attribute, or, as below, include them as a text field). Here's what I have:

#!/usr/bin/perl
use warnings;
use strict;
use WWW::Mechanize;

my $hypermutUrl = "http://www.hiv.lanl.gov/content/sequence/HYPERMUT/hypermut.html";
my $mech = WWW::Mechanize->new( autocheck => 1 );

my $alignment_as_string = <<'END_SEQS';
>HIV1-test.CONS
ATGGGATGTCTTGGGAATCAGCTGCTTATCGCGCTCTTGCTAGTAAGTGCTTTAGAGATTTATTGTGTTC
>HIV1-test.1
ATGGAATGTCTTGGAAATCAGCTGCTTATCGCGCTCTTGCTAGTAAGTGCTTTAAAGATTTATTGTGTTC
>HIV1-test.2
ATGGAATGTCTTGGGAATCAGCTGCTTATCGCGCTCTTGCTAGTAAGTGCTTTAAAGATTTATTGTGTTC
END_SEQS

$mech->post(
    $hypermutUrl,
    'FORMAT'            => 'FASTA',
    'ALIGNMENT'         => $alignment_as_string,
    'upfile1'           => '',
    'INN'               => '',
    'OUT'               => '',
    'MUTUPSTREAM'       => '',
    'MUTFROM'           => 'G',
    'MUTTO'             => 'A',
    'MUTDOWNSTREAM'     => 'RD',
    'ENFORCE'           => 'DESCENDANT',
    'CONTROLUPSTREAM'   => '',
    'CONTROLFROM'       => 'G',
    'CONTROLTO'         => 'A',
    'CONTROLDOWNSTREAM' => 'YN|RC',
    'Analysis'          => 'All',
    'submit'            => 'Run',
);

my $page = $mech->content;

# save the returned page locally
open my $FH, '>', '/tmp/hypermut.html'
    or die "cannot write /tmp/hypermut.html: $!\n";
print {$FH} $page;
close $FH;
warn "saved webpage data to /tmp/hypermut.html\n";

        I would expect the post operation to add the page resulting from that post to my $mech object -- where am I going wrong?
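
Two things in the post() call above look likely to explain the result. First, it posts to the hypermut.html page itself, whereas mech-dump shows the form's action as /cgi-bin/HYPERMUT/hypermut.cgi. Second, post() is inherited from LWP::UserAgent, which builds the request with POST() from HTTP::Request::Common; there, bare key/value pairs after the URL are taken as HTTP headers, and form data has to be passed as an array or hash reference (directly after the URL, or under Content). The offline sketch below builds both variants without sending anything. The absolute CGI URL is an assumption (the action resolved against the site's host), and the sequences are shortened placeholders.

```perl
use strict;
use warnings;
use HTTP::Request::Common qw(POST);

# ASSUMPTION: the form action resolved against the live host. mech-dump only
# showed it relative to a local file, so this absolute URL is a guess.
my $cgi_url = 'http://www.hiv.lanl.gov/cgi-bin/HYPERMUT/hypermut.cgi';

# Shortened placeholder alignment, for illustration only.
my $alignment = ">HIV1-test.CONS\nATGGGATG\n>HIV1-test.1\nATGGAATG\n";

# Variant 1: bare pairs after the URL become request headers; the body is empty.
my $as_headers = POST( $cgi_url, FORMAT => 'FASTA', submit => 'Run' );

# Variant 2: form fields go in Content; 'form-data' selects multipart/form-data,
# the encoding that mech-dump reports for this form.
my $as_form = POST(
    $cgi_url,
    Content_Type => 'form-data',
    Content      => [
        FORMAT    => 'FASTA',
        ALIGNMENT => $alignment,
        submit    => 'Run',
    ],
);

printf "headers variant: FORMAT header=%s, body length=%d\n",
    $as_headers->header('FORMAT'), length( $as_headers->content );
printf "form variant:    Content-Type=%s, body length=%d\n",
    $as_form->header('Content-Type'), length( $as_form->content );
```

The same argument shape (Content_Type => 'form-data', Content => [ ... ]) can be passed to $mech->post(). An alternative that avoids hand-building the request would be to get() the page and call $mech->submit_form( form_number => 2, fields => { ... }, button => 'submit' ), again assuming the upload form is the second form on the page.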