in reply to lwp not retieving the same page as from a browser

Define 'pull down the page'. When I connect to that page without running JavaScript I see no data, but if I look at the source, some (perhaps all) of it is there. The JavaScript hides and unhides the various data parts. What are you using to parse the HTML?

Re^2: lwp not retieving the same page as from a browser
by zuma53 (Beadle) on Aug 26, 2009 at 07:15 UTC
    pull down the page = what is returned from the GET request

    I turned off JavaScript in the browser and the 'missing' data is still present in the returned page (i.e. JavaScript has no effect on what gets returned; the browser still gets more data than Perl does).

    I guess the best way I can describe this is:

    Browser:
      Headers + GET => AxyzB

    Perl:
      Headers + GET => AxB

    where A, B, x, y, and z are sections of the returned HTML. x, y, and z are the sections associated with the tabbed areas.

    As far as I know, I am sending the same headers from Perl as the ones sent/shown via rexswain.com.
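One way to check the "as far as I know" part is to print the request as LWP prepares it and compare it line by line with what rexswain.com shows. The following is only a rough sketch: `prepared_request_text` is a hypothetical helper name, and the User-Agent string and URL are placeholders, not the ones from this thread. Nothing is sent over the network, so headers that LWP adds at the protocol level (such as `Host`) will not appear in the output.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;

# Hypothetical helper: return the request line and headers as LWP
# would prepare them for a GET, without contacting the server.
sub prepared_request_text {
    my ($agent, $url) = @_;

    my $ua = LWP::UserAgent->new;
    $ua->agent($agent);    # the usual way to set User-Agent

    my $request = HTTP::Request->new(GET => $url);
    $request = $ua->prepare_request($request);    # applies the default headers

    return $request->as_string;
}

# Placeholder values, for illustration only.
print prepared_request_text(
    'Mozilla/4.0 (hypothetical)',
    'http://www.example.com/',
);
```

Any header that differs from the browser's list, including the exact User-Agent value, is a candidate explanation for a different response.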

      If I change your code to the following (replacing your User-Agent with the one used by rexswain.com, and setting it with the normal agent() call):

          use strict;
          use warnings;
          use LWP::UserAgent;

          my $browser = LWP::UserAgent->new();

          # User-Agent string copied from the rexswain.com request
          $browser->agent('Mozilla/5.0 (X11; U; OpenBSD i386; en-US; rv:1.8.1.22) Gecko/20090626 SeaMonkey/1.1.17 XpcomViewer/0.9');

          my $response = $browser->get('http://brtweb.phila.gov/brt.apps/Search/SearchResults.aspx?id=6546003202');
          print $response->content;

      I then get the same number of lines and the same amount of text as rexswain.com does. I have not verified the content; can you check? With your User-Agent string the response is 41437 bytes; with the rexswain User-Agent (used above) it is 43314 bytes, which is the same size the rexswain.com form returns. Perhaps sending Mozilla/4.0 instead of 5.0 was triggering some code path in their ASP code that you would not see otherwise.
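If you want to repeat the size comparison yourself, something along these lines might do. Treat it as a sketch under assumptions: `sizes_by_agent` and the script name `compare_agents.pl` are made up for illustration, and the `Mozilla/4.0` string is a placeholder because I do not have your exact original User-Agent. It fetches the same URL once per User-Agent string and reports the byte count of each response.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical helper: fetch $url once per User-Agent string and
# return a hash ref of { agent string => response size in bytes }.
# A failed request is recorded as undef instead of a size.
sub sizes_by_agent {
    my ($ua, $url, @agents) = @_;

    my %size;
    for my $agent (@agents) {
        $ua->agent($agent);
        my $response = $ua->get($url);
        $size{$agent} = $response->is_success
            ? length($response->content)
            : undef;
    }
    return \%size;
}

# Run as: perl compare_agents.pl URL
# Nothing is fetched unless a URL is given on the command line.
if (my $url = shift @ARGV) {
    my @agents = (
        'Mozilla/4.0 (placeholder for the original string)',
        'Mozilla/5.0 (X11; U; OpenBSD i386; en-US; rv:1.8.1.22) Gecko/20090626 SeaMonkey/1.1.17 XpcomViewer/0.9',
    );
    my $size = sizes_by_agent(LWP::UserAgent->new, $url, @agents);
    printf "%6s bytes  %s\n", $size->{$_} // 'n/a', $_ for @agents;
}
```

A difference in byte counts only shows that the server treats the two strings differently; the content itself still needs to be checked by eye or with a diff.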
        Yes, that did the trick! I would never have thought of that.

        Thank you for your help.