in reply to Help with LWP GET request

I can reproduce that it's not working:

Unquoted string "http" may clash with future reserved word at tmp.pl line 2.

This is most likely because you are missing a starting single quote in line 2.

Maybe you want to elaborate further in what sense it is "not working".

In the end, no server can discern whether it's a browser+human or a well-crafted Perl script that sends the requests over the wire, so your goal is to mimic what the combination of browser+human sends over the wire with your Perl script. To do that, you need tools to compare what your browser sends against what your Perl script sends.

Here's my checklist of things to do while trying to scrape a website:

  1. Does the site work when manually browsing to it using a browser?
  2. Does the site use frames? If so, does the target frame page work when manually browsing to it using a browser?
  3. Does the site work when manually browsing to it with JavaScript disabled?
  4. What does the Firefox Live HTTP Headers extension show?
  5. Is the output of Live HTTP Headers identical to what the Wireshark TCP dump of your script says?
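
For the last two items, it can help to print the request your script builds before it goes over the wire, so you have something to hold next to the browser's headers. The following is only a rough sketch: the URL, the User-Agent string and the Accept header are placeholders I made up, not values from your post. It uses prepare_request() from LWP::UserAgent, which fills in the user agent's default headers without sending anything. Headers added later by the transport layer (Host, for example) will not show up here, so the Wireshark dump remains the final word.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;

# Hypothetical URL and query string; substitute the one that works in your browser.
my $url = 'http://www.example.com/search?q=perl';

my $ua = LWP::UserAgent->new;

# Assumption: a made-up browser-like string; copy the real one from your browser.
$ua->agent('Mozilla/5.0 (X11; Linux) Gecko Firefox');

my $request = HTTP::Request->new( GET => $url );

# Placeholder header; add whatever your browser sends and your script does not.
$request->header( Accept => 'text/html' );

# Apply the user agent's default headers without sending the request.
$request = $ua->prepare_request($request);

# Request line plus headers, ready to compare against the browser's output.
my $dump = $request->as_string;
print $dump;
```

Any header that appears in the browser's output but not in this dump is a candidate for why the server answers your script differently.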

My current web scraping tools are WWW::Mechanize for navigation (with WWW::Mechanize::Shell for quick exploration) and Web::Scraper for data extraction.
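
To give an idea of the extraction side, here is a minimal Web::Scraper sketch. The HTML snippet, the "result" class and the "hits" key are invented for illustration; the real page will need its own selectors. It scrapes an inline string so it runs without network access.

```perl
use strict;
use warnings;
use Web::Scraper;

# Made-up HTML standing in for a fetched result page.
my $html = <<'HTML';
<html><body>
<ul>
  <li class="result">first hit</li>
  <li class="result">second hit</li>
</ul>
</body></html>
HTML

# Collect the text of every <li class="result"> into the list "hits".
my $results = scraper {
    process 'li.result', 'hits[]' => 'TEXT';
};

my $data = $results->scrape($html);

print "$_\n" for @{ $data->{hits} || [] };
```

If the list comes back empty for the HTML your script receives but not for the HTML your browser shows, the two are not getting the same page, which points back to the checklist above.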

Update: You changed your post to include a bit more of your code, but as you still don't tell us where and how it fails for you, and as it still contains the typo, that is of no more help than the previous content.

Re^2: Help with LWP GET request
by mhearse (Chaplain) on Jun 11, 2008 at 09:03 UTC
    Thanks for the reply. The missing single quote was left off when creating the post. It should be fixed now. I'm not really getting an error message. When I run the program, the HTML printed with as_string() doesn't contain the results for my query. If I simply paste the correct URL into a browser (including the query string), the results are displayed. I will go over the things you suggested and see if I can get it working.