scottknight has asked for the wisdom of the Perl Monks concerning the following question:

Hello, I am trying to write a script that gets the content of a web page, for a platform that has a pretty stripped-down Perl 5.8.0. LWP and CPAN are not included, and I was hoping not to have to require or distribute LWP to do what I want, so I wrote this instead:
    use IO::Socket::INET;
    use MIME::Base64 ();

    $pass64 = MIME::Base64::encode_base64("$username:$password");

    my $sock;
    unless ($sock = new IO::Socket::INET (PeerAddr => $host, PeerPort => $port, Proto => 'tcp', Timeout => 5)) {
        $errormessage = $tr{'could not connect to http://$host:$port/$filename'};
        return '0';
    }
    $sock->print("GET /".$filename." HTTP/1.0\r\n");
    $sock->print("Host: $host\r\n");
    $sock->print("Authorization: Basic $pass64\r\n");
    $sock->print("\r\n");

    while (<$sock>) {
        printf $_;
    }
    close($sock);
This has been working just fine until I came up against a D-Link modem that only spits back 501 Not Implemented messages at it. I found that it also kind of works on a Linksys wireless access point since it will happily get the root url, but a request for any other doc will return a 403 Error.

When using wget 1.6 or 1.9, I am always able to get all of the docs from all of the devices I try, so at least I have a benchmark.

I used ethereal to sniff the packets to see if I could find any difference between the wget and my script's traffic. There were only two differences:

    1) With wget, the Authorization: Basic string has the \r\n at the end, where the one I send with Perl only has \n. This should not be a problem with the D-Link since it doesn't require authentication. It's still an unexpected anomaly.

    2) With wget, the entire request shows up as a single packet in ethereal (GET, Host, User-Agent, Content-Type, etc). With Perl it shows a single packet that ends at the end of the GET line and a 'Continuation' packet that contains the rest of the request. The packet that contains the GET line ends with just \r\n, so I cannot figure out why the request is broken into two packets.

Anyone have any hints at all? (besides "Just use LWP")

Thanks.

Replies are listed 'Best First'.
Re: Problems with IO::Socket::INET
by dave_the_m (Monsignor) on Nov 29, 2004 at 15:14 UTC
    Your request is being broken up into separate packets because you are submitting it in multiple print statements.

    For portability, you shouldn't be using "\r\n" - this expands to different characters on different platforms; you should use "\015\012" instead.

    I'm not sure why you're losing the \r; it may be a binmode() thing depending on what platform you're using.

    Dave.
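A minimal sketch of the two points above, using only core Perl: the whole request is assembled into one string with explicit \015\012 line endings, so it can be handed to the socket in a single print. IO::Socket handles have autoflush turned on by default, which is probably why each separate print went out as its own write. The helper name build_request and the example host and path are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a complete HTTP/1.0 GET request as one string.
sub build_request {
    my ($host, $path, @extra_headers) = @_;
    my $eol = "\015\012";    # CR LF as fixed octets, not "\r\n"
    return join($eol,
        "GET /$path HTTP/1.0",
        "Host: $host",
        @extra_headers,
        "",                  # empty element yields the blank line after the headers
        "");                 # ... and the CR LF that terminates it
}

# Hypothetical host and document name, for illustration only.
my $request = build_request('192.168.0.1', 'status.html', 'Accept: */*');

# With a connected socket, the request would then be sent in one call:
#     $sock->print($request);
```

Whether the operating system still splits the data into several TCP segments is up to the network stack, but a single print at least hands it over in one write.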

      dave_the_m,

      Thank you very much for your help. I suspected it was something with the fact that I was using print, but not that it was multiple print statements since the 'rest' of the request was contained in one packet (or so it seems, looking at ethereal output). At any rate, if I construct a single string and send that as a single print, it behaves the same way through ethereal. Now, I am getting a different error from the Linksys, but I suspect that is a different issue.

      As for the carriage returns, I was only using \r\n because that is what the examples I worked from used, and the ethereal output from wget confirmed it. I am working on Linux and have no current plans to port this to any other platforms, but you have piqued my interest. When I use \015\012 in place of the \r\n in my string, I get exactly the same results when looking at the packet, even the mysterious missing \r after the base64 encoded string.

      Off for more testing. Thanks again for your help.
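One likely explanation for the missing \r, offered as a sketch and not as a confirmed diagnosis of the capture described above: MIME::Base64::encode_base64 appends a newline to its output unless an end-of-line string is passed as the second argument. Interpolating the default result into the header puts a bare \n directly after the encoded credentials, ahead of the intended line ending. The credentials below are hypothetical.

```perl
use strict;
use warnings;
use MIME::Base64 ();

my ($username, $password) = ('user', 'secret');   # hypothetical credentials
my $eol = "\015\012";

# Default call: the encoded string carries a trailing "\n".
my $with_newline = MIME::Base64::encode_base64("$username:$password");

# Empty second argument: no line ending is appended.
my $bare = MIME::Base64::encode_base64("$username:$password", "");

my $broken_header = "Authorization: Basic $with_newline" . $eol;
my $clean_header  = "Authorization: Basic $bare" . $eol;

# Print the last three octets of each header as decimal values.
print join(' ', map { ord($_) } split //, substr($broken_header, -3)), "\n";   # 10 13 10
print join(' ', map { ord($_) } split //, substr($clean_header,  -3)), "\n";   # 61 13 10
```

In the broken header the bare \n (octet 10) ends the Authorization line early, and the \015\012 that follows reads as an empty line, which a strict server may treat as the end of the headers.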
        I just thought I would post a final follow-up to this thread. Thanks to the advice of dave_the_m, I have this working splendidly and figure a snippet of code might help someone who wants a lightweight and simple alternative to LWP for whatever reason (mine is that Smoothwall doesn't have it and I don't want to distribute it). Here is code that will work ($host, $port, $filename, $username, $password are all just plain strings passed in from somewhere else):
            use IO::Socket::INET;
            use MIME::Base64 ();

            # Empty second argument: no newline is appended to the encoded string
            $pass64 = MIME::Base64::encode_base64("$username:$password", "");

            my $sock;
            unless ($sock = new IO::Socket::INET (PeerAddr => $host, PeerPort => $port, Proto => 'tcp', Timeout => 5)) {
                $errormessage = $tr{'could not connect to http://$host:$port/$filename'};
                return '0';
            }

            # Build the whole request as one string so it is sent with a single print
            my $request = "";
            my $eol = "\015\012";
            $request .= "GET /$filename HTTP/1.0".$eol;
            $request .= "User-Agent: Perl/5.8.0".$eol;
            $request .= "Host: $host:$port".$eol;
            $request .= "Accept: */*".$eol;
            $request .= "Connection: Keep-Alive".$eol;
            if ($auth_required) {
                $request .= "Authorization: Basic $pass64".$eol;
            }
            $request .= $eol;
            $sock->print($request);

            # print, not printf: a '%' in the response would be taken as a format directive
            while (<$sock>) {
                print $_;
            }
            close($sock);

        Obviously, in that while loop, you will want to do something more useful with the data than print it to the console :) I put each line into an array element.
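Collecting the lines into an array, as described above, might look like the following sketch. It reads from an in-memory filehandle holding a canned response so that it runs without a network connection; with a real connection the same function would be passed $sock instead. The function name read_response and the sample response text are hypothetical. Pulling out the status code also makes replies like the 501 and 403 mentioned earlier in the thread easy to detect.

```perl
use strict;
use warnings;

# Hypothetical helper: reads an HTTP response from any filehandle and returns
# the numeric status code plus references to the header lines and body lines.
sub read_response {
    my ($fh) = @_;
    my (@headers, @body);

    my $status_line = <$fh>;
    return unless defined $status_line;
    my ($code) = $status_line =~ m{^HTTP/\d+\.\d+\s+(\d{3})};

    my $in_body = 0;
    while (my $line = <$fh>) {
        $line =~ s/\015?\012\z//;          # strip CR LF or a bare LF
        if (!$in_body && $line eq '') {    # first blank line ends the headers
            $in_body = 1;
            next;
        }
        if ($in_body) { push @body, $line } else { push @headers, $line }
    }
    return ($code, \@headers, \@body);
}

# Canned response standing in for what a device might send back.
my $canned = join "\015\012",
    'HTTP/1.0 200 OK',
    'Content-Type: text/html',
    '',
    '<html>',
    '</html>',
    '';

open my $fh, '<', \$canned or die "cannot open in-memory handle: $!";
my ($code, $headers, $body) = read_response($fh);
close $fh;

print "status $code, ", scalar(@$body), " body lines\n";
```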