Okay, maybe not really a Perl question per se, but here goes:

I have a small program (gethttp) that makes simple HTTP requests and prints the response to stdout. Unfortunately, LWP is not available, so I had to do it by hand using IO::Socket. I copied the program from one of the Perl man pages and modified it only slightly.

For the most part, this program works just great. Every once in a while, however, I find a page (usually a CGI) that doesn't seem to work. I get back a 404 error, even though I know the page is there because I can access it with my web browser.

I figure there must be something going on that I'm just not getting. I can't find any documents explaining a different syntax for the HTTP GET, and I can't see anything wrong with my Perl code. Mostly I just want to understand why it's not working and what's going on, though it might have a practical application in a project I'm working on if I can get it to work.

I'm including the code from my program below, along with a URL that it doesn't work on and the response I get back. (I don't know if everyone can get to the URL, since it might be private to UF. Let me know if you have problems.)

I would appreciate any help very much!

The Program...

    #!/usr/bin/perl -w
    use IO::Socket;

    unless (@ARGV) { die "usage: $0 URL\n" }

    $EOL   = "\015\012";
    $BLANK = $EOL x 2;
    $sep   = (@ARGV > 1) ? "-------------------\n" : "";

    foreach $url ( @ARGV ) {
        unless ($url =~ m{^http://(.*?)/}) {
            print "$0: invalid url: $url\n";
            next;
        }
        $host = $1;
        $remote = IO::Socket::INET->new(
            Proto    => "tcp",
            PeerAddr => $host,
            PeerPort => "http(80)",
        );
        unless ($remote) { die "Cannot connect to http daemon on $host\n" }
        $remote->autoflush(1);
        print $remote "GET $url HTTP/1.0" . $BLANK;
        while ( <$remote> ) { print }
        print "\n$sep";
        close $remote;
    }

The Response

    $ ./gethttp 'http://login.gatorlink.ufl.edu/authenticate.cgi'
    HTTP/1.0 404 Not Found
    Date: Fri, 12 Jan 2001 22:57:21 GMT
    Server: Apache/1.3.6 (Unix) mod_perl/1.19 mod_ssl/2.2.8 OpenSSL/0.9.2b
    Connection: close
    Content-Type: text/html

    <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
    <HTML><HEAD>
    <TITLE>404 Not Found</TITLE>
    </HEAD><BODY>
    <H1>Not Found</H1>
    The requested URL http://login.gatorlink.ufl.edu/authenticate.cgi was not found on this server.<P>
    </BODY></HTML>
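
For comparison: a browser talking directly to a web server normally puts only the path on the request line and names the host in a Host: header, whereas gethttp sends the full URL and no headers. Below is a minimal sketch of building a request in that form. It assumes this difference matters for the failing page, which has not been confirmed against this server. build_request is a hypothetical helper, not part of the original program.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $EOL = "\015\012";

# Hypothetical helper (not in gethttp): split an http:// URL into host,
# port and path, then build an HTTP/1.0 request that puts only the path
# on the request line and names the host in a Host: header.
sub build_request {
    my ($url) = @_;

    # Returns an empty list if the URL does not look like http://host[:port][/path]
    my ($host, $port, $path) = $url =~ m{^http://([^/:]+)(?::(\d+))?(/.*)?$}
        or return;

    $port = 80  unless defined $port;   # default HTTP port
    $path = '/' unless defined $path;   # bare host means the root document

    # Only mention the port in the Host: header when it is not the default
    my $host_header = $port == 80 ? $host : "$host:$port";

    my $request = "GET $path HTTP/1.0$EOL"
                . "Host: $host_header$EOL"
                . $EOL;                 # blank line ends the headers

    return ($host, $port, $request);
}

my ($host, $port, $request) =
    build_request('http://login.gatorlink.ufl.edu/authenticate.cgi');
print $request;
```

In gethttp, the returned string would take the place of "GET $url HTTP/1.0" . $BLANK in the print to $remote, with $host and $port passed to IO::Socket::INET->new as PeerAddr and PeerPort.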

In reply to HTTP GET without LWP by bbfu
