I am trying to automatically retrieve information from an online database. So far I fill in the forms and submit them using WWW::Mechanize. The results are spread over several pages, and that is where the problems start: I cannot navigate from the first page to the others in order to download them as well. So far I only get the first of up to twenty pages of results.
The database search engine can be found at http://europa.eu.int/prelex/rech_avancee.cfm?CL=en. After filling in series (COM) and year (e.g., 1999), the form is submitted and the first of twenty pages of results is displayed. I save this page and would like to go to the next one, but I do not know how to operate this kind of navigation bar (the 1-20 fields). The HTML code for one of these fields looks like this:
<A HREF="liste_resultats.cfm?PCP=1&CL=en" ONMOUSEOUT="isimgact( 'btn_nav_pin52', 'btn_nav_pinoff')" ONMOUSEOVER="isimgact( 'btn_nav_pin52', 'btn_nav_pinon')"><IMG src="images/btn_pin.gif" BORDER="0" HEIGHT="17" WIDTH="18" NAME="btn_nav_pin52" ALT="COM (1976) 728 - COM (1976) 697-3"></A></td>

I tried to go to the href-link directly, but then only an empty form is displayed.
Here is my code:
#!/usr/bin/perl -w
use strict;
use WWW::Mechanize;
use LWP::Simple;

my $agent = WWW::Mechanize->new();
$agent->get("http://europa.eu.int/prelex/rech_avancee.cfm?CL=en");
$agent->form(2);
$agent->field("clef2", "1999");
$agent->field("clef1", 'COM');
$agent->field("nbr_element", '99');
$agent->click();

my @pcp=(1, 100, 199, 298, 397, 496, 595, 694, 793, 892, 991, 1090, 1189, 1288, 1387, 1486, 1585, 1684, 1783, 1882);
my $pcp;
foreach $pcp (@pcp) {
    my @input;
    @input=get("http://europa.eu.int/prelex/liste_resultats.cfm?PCP=$pcp\&CL=en");
    my $input;
    foreach $input (@input) {
        open RESULTS, ">>C:/programme/perl/test/result.txt";
        print RESULTS "$input\n";
        close(RESULTS);
    }
}

Help would be greatly appreciated. I looked at the descriptions for MECHANIZE, but they only mention "regular" buttons.
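The PCP offsets listed in @pcp rise in steps of 99, the same value set for nbr_element, so they could be generated rather than typed out. The following is only a minimal sketch of that arithmetic: the helper names page_offsets and result_url are hypothetical, and the page count of 20 is taken from the description above, not read from the site.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: start offset of each result page,
# assuming page k (counting from 0) starts at 1 + per_page * k.
sub page_offsets {
    my ($per_page, $pages) = @_;
    return map { 1 + $per_page * $_ } 0 .. $pages - 1;
}

# Hypothetical helper: result-list URL for one offset, using the
# same address and parameters as the script above.
sub result_url {
    my ($offset) = @_;
    return "http://europa.eu.int/prelex/liste_resultats.cfm?PCP=$offset&CL=en";
}

# 99 hits per page and 20 pages, as in the search described above.
my @offsets = page_offsets(99, 20);
print result_url($_), "\n" for @offsets;
```

Whether the empty form comes from missing session state is an assumption. LWP::Simple's get runs as a separate client from $agent and does not share its cookies, so it may be worth fetching each URL with $agent->get(result_url($offset)) and reading $agent->content instead.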
In reply to WWW::Mechanize and Navigation by New Novice