Help with LWP GET request

mhearse has asked for the wisdom of the Perl Monks concerning the following question:

Re: Help with LWP GET request by Corion (Patriarch) on Jun 11, 2008 at 08:51 UTC
I can reproduce that it's not working: `Unquoted string "http" may clash with future reserved word at tmp.pl line 2.` This is most likely because you are missing the opening single quote in line 2. Maybe you want to elaborate on the sense in which it is "not working".

In the end, no server can tell whether the requests come from a browser plus a human or from a well-crafted Perl script, so your goal is to make your Perl script send over the wire what the browser-plus-human combination sends. To do that, you need tools to compare what your browser sends against what your Perl script sends.

Here's my checklist of things to do while trying to scrape a website:

- Does the site work when manually browsing to it using a browser?
- Does the site use frames? If so, does the target frame page work when manually browsing to it using a browser?
- Does the site work when manually browsing to it with JavaScript disabled?
- What does the Firefox HTTP Live Headers extension output?
- Is the output of HTTP Live Headers identical to what the Wireshark TCP dump of your script shows?

My current web scraping tools are WWW::Mechanize for navigation (with WWW::Mechanize::Shell for quick exploration) and Web::Scraper for data extraction.

Update: You changed your post to include a bit more of your code, but as you still don't tell us where and how it fails for you, and as it still contains the typo, it is of no more help than the previous content.
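One way to see what the script side of that comparison looks like, without a packet sniffer, is to have LWP show the request itself. The following is only a minimal sketch: the example.com URL and the agent string are placeholders, not taken from the original post. It uses `prepare_request` to fill in the user agent's default headers without contacting any server, and registers a `request_send` handler that would dump each request if one were actually sent.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;

# Placeholder agent string; a real comparison would copy the browser's value.
my $ua = LWP::UserAgent->new( agent => 'Mozilla/5.0 (example)' );

# Dump every request just before it goes out. This only fires on a real
# send, so it does nothing in this offline sketch.
$ua->add_handler( request_send => sub { shift->dump; return } );

# Placeholder URL, used only for illustration.
my $req = HTTP::Request->new(
    GET => 'http://www.example.com/search.php?field=code&kw=LAX' );

# Apply the user agent's default headers (such as User-Agent) to the
# request without performing any network I/O.
$ua->prepare_request($req);

my $wire = $req->as_string;
print $wire;
```

The printed request line and headers can then be set side by side with what the browser's header tool reports.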
Re^2: Help with LWP GET request by mhearse (Chaplain) on Jun 11, 2008 at 09:03 UTC
Thanks for the reply. The missing single quote was left off when creating the post. It should be fixed now. I'm not really getting an error message. When I run the program, the HTML printed with `as_string()` doesn't contain the results for my query. If I simply paste the correct URL (including the query string) into a browser, the results are displayed. I will go over the things you suggested and see if I can get it working.
Re: Help with LWP GET request by derby (Abbot) on Jun 11, 2008 at 11:17 UTC
The extra parameters to LWP::UserAgent::get do not add URL parameters; they set HTTP header values, which is not what you want. For a GET, you need to construct the URL yourself: `my $res = $ua->get( "http://www.airport-data.com/search/intl-airports.php?field=code&kw=$arg" );`

-derby

Update: As ikegami pointed out, this is not a complete solution. You do need to guard against bad input ... hmmm ... I'm going to start putting a standard disclaimer on my posts saying that my solutions are not complete and are for illustrative purposes only.
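The difference can be seen without touching the network by building the two requests and printing them. This is a rough sketch, not the original poster's code: it uses `GET` from HTTP::Request::Common, which accepts the same URL-plus-header arguments, and the value `LAX` for `$arg` is an assumed example. The second request puts the parameters into the query string with `URI`'s `query_form`.

```perl
use strict;
use warnings;
use HTTP::Request::Common qw( GET );
use URI;

my $base = 'http://www.airport-data.com/search/intl-airports.php';
my $arg  = 'LAX';    # assumed example value

# Extra key/value pairs become HTTP header fields; the URL is unchanged.
my $as_headers = GET $base, field => 'code', kw => $arg;
print $as_headers->as_string, "\n";

# Putting the same pairs into the query string changes the URL itself.
my $uri = URI->new($base);
$uri->query_form( field => 'code', kw => $arg );
my $as_query = GET $uri;
print $as_query->as_string, "\n";
```

In the first request the server sees `Field` and `Kw` headers and no query string, so a search page has nothing to search for; in the second the pairs arrive as `?field=code&kw=LAX`.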
Re^2: Help with LWP GET request by ikegami (Patriarch) on Jun 11, 2008 at 12:52 UTC
That's buggy for many values of $arg: any value containing characters that are special in a URL (a space, `&`, `#` and so on) produces a broken or different query string. Fix: `use URI::Escape qw( uri_escape ); my $form_url = "http://www.airport-data.com/search/intl-airports.php"; my $res = $ua->get( "$form_url?field=code&kw=" . uri_escape($arg) );`
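To illustrate what the escaping changes, here is a small sketch that only builds the URLs and prints them. The helper name `search_url` and the sample values are made up for this example; only `uri_escape` and the form URL come from the post above.

```perl
use strict;
use warnings;
use URI::Escape qw( uri_escape );

my $form_url = 'http://www.airport-data.com/search/intl-airports.php';

# Hypothetical helper: build the search URL with the keyword escaped.
sub search_url {
    my ($kw) = @_;
    return "$form_url?field=code&kw=" . uri_escape($kw);
}

# Sample inputs (assumed): one harmless, two that break a naive URL.
for my $arg ( 'LAX', 'New York', 'A&B=C' ) {
    printf "%-10s => %s\n", $arg, search_url($arg);
}
```

Without the escaping, `A&B=C` would be read by the server as `kw=A` plus a separate parameter `B=C`; with it, the keyword is sent as `A%26B%3DC` and arrives intact.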