alexnzus has asked for the wisdom of the Perl Monks concerning the following question:

Hello,

I am currently working on an application that is supposed to retrieve some information from a website. I build a query containing the street address I want information on. The final part of the URL looks as follows:

../ListResults.aspx?State=TX&County=DALLAS&key=NTREIS&txtStreetNumberFrom=1433&txtStreetNumberTo=1433&txtStreetName=LAMP+POST

When I check it in a browser (either Netscape or IE), it works fine: I receive a web page with the data I need. In my application, though, the returned page contains none of the lines I need. If I look for information on an address with a single-word street name, e.g.

../ListResults.aspx?State=TX&County=DALLAS&key=NTREIS&txtStreetNumberFrom=4533&txtStreetNumberTo=4533&txtStreetName=BELFORT
my application returns correct results. What could be wrong?

I don't provide the entire URL string, because the website requires the user agent to be a browser and has a cookie-based authentication mechanism.

Your expert advice would be greatly appreciated.

Re: Need Help with LWP
by ikegami (Patriarch) on Oct 29, 2004 at 18:23 UTC
    Maybe the website is buggy and can't handle spaces escaped as '+'? Try replacing each '+' with '%20'.
      URI::Escape's uri_escape function handles these details for you.
        I'm afraid the problem is not in the URL string itself. I used "+" to concatenate all parts of the street name, because this is what the website code does. The user is supposed to fill in an online form, and click the "Search" button. The URL is displayed in the "Address" field of the browser. Actually, one can use "%20" instead of "+" and it works (in the browser). I just can't quite figure out why the website returns "0 records found" when the very same URL string is sent via HTTP::Request. As I mentioned before, everything works fine if the street name consists of just one word.
      Tried with exactly the same results. Thanks anyway.
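
For reference, here is a minimal sketch of the two encodings discussed above, using the street name from the question. `uri_escape` from URI::Escape writes the space as `%20`, while the `query_form` method of the URI module writes it as `+`. The host `www.example.com` is only a placeholder, since the real URL was not posted.

```perl
use strict;
use warnings;
use URI;
use URI::Escape qw(uri_escape);

my $street = 'LAMP POST';

# uri_escape percent-encodes the space.
my $escaped = uri_escape($street);
print "$escaped\n";    # LAMP%20POST

# query_form builds a form-style query string and encodes spaces as '+'.
# The host below is a placeholder, not the real site.
my $uri = URI->new('http://www.example.com/ListResults.aspx');
$uri->query_form(
    State               => 'TX',
    County              => 'DALLAS',
    key                 => 'NTREIS',
    txtStreetNumberFrom => 1433,
    txtStreetNumberTo   => 1433,
    txtStreetName       => $street,
);
print $uri->query, "\n";    # ends with txtStreetName=LAMP+POST
```

Both forms decode to the same value on a well-behaved server, which fits the observation that either one works in the browser.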
Re: Need Help with LWP
by ww (Archbishop) on Oct 29, 2004 at 20:10 UTC
    Have you looked at the source?

    FWIW, this is a WAG that may be way off base, but you may find that the ampersands in the URL are actually coded as character entities, " ampersand a m p semicolon " (spaces inside the quotes for legibility only), in the source.

      No, that is not the case, I am afraid.
Re: Need Help with LWP
by Anonymous Monk on Nov 01, 2004 at 00:29 UTC
    You could always sniff the wire with something like Ethereal to find any significant differences between the request that works (your browser) and the request that doesn't (LWP). It might seem like overkill, but when you're stuck, sometimes ya just gotta get down there and actually see what's going on.
      Sometimes you want to see the HTTP and HTML without running "tcpdump".

      If you are using Windows I recommend an HTTP proxy called Proxomitron. If you can't run that, then HTTP::Proxy only takes a few lines of Perl (see 373013).
        Those could definitely work. The same kind of functionality can still be achieved with Ethereal just by using its filters. A display filter of 'http' or a capture filter of 'tcp port 80' would do the trick nicely. That is assuming this is all regular HTTP traffic.
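
Before reaching for a sniffer or a proxy, it may also help to print the request as LWP itself builds it and compare it by eye with what the browser sends. The sketch below is only an illustration: the URL and the agent string are placeholders (the real ones were not posted), and it assumes the application uses LWP::UserAgent with an HTTP::Cookies jar.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;
use HTTP::Cookies;

# Placeholder URL: the real host and path were not posted.
my $url = 'http://www.example.com/ListResults.aspx?txtStreetName=LAMP+POST';

my $ua = LWP::UserAgent->new(
    # Hypothetical browser-like agent string, since the site insists on one.
    agent      => 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)',
    # In-memory cookie jar for the cookie-based authentication.
    cookie_jar => HTTP::Cookies->new,
);

my $request = HTTP::Request->new( GET => $url );

# prepare_request applies the agent string, default headers and any stored
# cookies to the request without sending it over the network.
my $prepared = $ua->prepare_request($request);

# Show the request line and headers that would be sent.
print $prepared->as_string;
```

Passing `$prepared` to `$ua->request` would then actually send it. If the printed request line already differs from the URL shown in the browser's address field, the problem is on the Perl side; if it matches, a wire capture as described above is the next step.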