Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello monks,
I am trying to create a socket and then request
pages one by one (around 10 pages).
But only the first page's contents are obtained; the rest are
not displayed at all. I have not been able to find the reason.
When I request the pages individually, they are fetched properly.
Any help would be appreciated.

#!/usr/bin/perl -w

### Create a socket to read the pages from ISJ
use IO::Socket;

$sock = new IO::Socket::INET (
    PeerAddr => 'www.somewhere.com',
    PeerPort => 80,
    Proto    => 'tcp',
    Timeout  => 10,
);
die "Socket could not be created $!\n" unless $sock;

open ( URLforSoc, "$FileCon" ) or die "Cant open the file for reading!!!";
while ( <URLforSoc> ) {
    print $sock "GET ${URL_VAL} HTTP /1.0\n\n";
    while($line = <$sock>) {
        print "$line";
    }
    $sock->flush();
}
close ($sock);
exit 0;

Title edit by tye

Replies are listed 'Best First'.
Re: sockets
by rjray (Chaplain) on Mar 20, 2003 at 03:11 UTC

    First and foremost, don't re-invent the wheel. Unless you have a specific reason for not doing so, you should be using LWP::UserAgent for this.

    The short answer to the "why?" part of your question is this: HTTP/1.0 works on a single request/response model, so the server closes the connection after the first response and later requests on the same socket get nothing back. With HTTP/1.1, you could use Keep-Alive to work around this. But generally, for simple HTTP requests you go one at a time.

    A great advantage of the LWP set of classes is that the user-agent can be configured to try to use keep-alive, falling back to the single-transaction model if needed. Really, you are much better off using this.

    --rjray
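
As a rough sketch of the keep-alive configuration rjray describes: LWP::UserAgent accepts a keep_alive option at construction time, which sets up a connection cache so that successive requests to the same host can reuse a connection where the server allows it. The fetch_all helper and the idea of taking the URLs from the command line are assumptions made for illustration, not part of the original script.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# keep_alive => 1 asks LWP to keep one idle connection cached for reuse.
my $ua = LWP::UserAgent->new(keep_alive => 1, timeout => 10);

# Hypothetical helper: fetch each URL in turn through the same agent.
sub fetch_all {
    my ($agent, @urls) = @_;
    for my $url (@urls) {
        my $res = $agent->get($url);
        if ($res->is_success) {
            print $res->decoded_content;
        } else {
            warn "failed to get $url: ", $res->status_line, "\n";
        }
    }
}

# URLs are taken from the command line (an assumption for this sketch).
fetch_all($ua, @ARGV);
```

Whether a connection is actually reused still depends on the server; LWP falls back to one connection per request when it is not.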

Re: sockets
by pg (Canon) on Mar 20, 2003 at 06:01 UTC
    This is what others mentioned:
    use LWP::UserAgent;
    use HTTP::Request;

    my $agent = new LWP::UserAgent();
    while (<DATA>) {
        chomp;
        my $res = $agent->request(new HTTP::Request(GET => $_));
        if ($res->is_success) {
            print $res->content;
        } else {
            print "failed to get $_\n";
        }
    }
    __DATA__
    http://www.yahoo.com
    http://www.cnn.com
    http://www.somethingnotexists.abc


    Well, there is no harm in playing with sockets, so here you are (pay attention to the comment I made in the following demo):
    use IO::Socket;

    while (<DATA>) {
        chomp;
        print "try to connect to host $_ ...\n";
        $sock = new IO::Socket::INET(PeerAddr => $_,
                                     PeerPort => 80,
                                     Proto    => 'tcp',
                                     Timeout  => 10 );
        if (!defined($sock)) {
            print "failed to connect to host $_\n";
            next;
        } else {
            print "connected to host $_\n";
        }
        print $sock "GET http://$_ HTTP/1.1\r\n" . "Host: $_\r\n\r\n";
        print $_ while (<$sock>); #this will fail to end, if peer does not close the connection
        close($sock);
    }
    __DATA__
    www.yahoo.com
    www.cnn.com
    www.somethingnotexists.abc
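
pg's comment points out that reading to end-of-file only terminates if the peer closes the connection. One common way to ask for that with HTTP/1.1 is to send a "Connection: close" header along with the request. The sketch below only builds the request text; build_request is a hypothetical helper, and sending a path instead of a full URL in the request line is a choice made for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: build an HTTP/1.1 GET request that asks the server
# to close the connection once the response has been sent.
sub build_request {
    my ($host, $path) = @_;
    $path = '/' unless defined $path && length $path;
    # Each header line ends in CRLF; an empty line ends the header block.
    return join "\r\n",
        "GET $path HTTP/1.1",
        "Host: $host",
        "Connection: close",
        "", "";
}

print build_request('www.somewhere.com', '/index.html');
```

With a request like this, a read-until-EOF loop such as the one in the demo should end once the server has sent the response, assuming the server honours the header.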
Re: sockets
by Zaxo (Archbishop) on Mar 20, 2003 at 02:54 UTC

    ${URL_VAL} is never defined. Do you mean:

        print $sock "GET ${_} HTTP /1.0\n\n";

    Why not use LWP::UserAgent for this?

    After Compline,
    Zaxo

      I am sorry, I missed that line.
      It is present in the script, just after the while:

      $URL_VAL=$_;
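
A minimal sketch of the original script reworked for the one-request-per-connection model: the socket is opened inside the loop, each line from the file is chomped before use, and the request line is written as "HTTP/1.0" (no space) with CRLF line endings. The helper names request_for and fetch_page, and the command-line arguments, are hypothetical and chosen only for this example.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Hypothetical helper: turn one line from the URL file into an HTTP/1.0 request.
sub request_for {
    my ($line) = @_;
    chomp $line;                       # drop the newline read from the file
    return "GET $line HTTP/1.0\r\n\r\n";
}

# Hypothetical helper: open a fresh connection, send one request, read to EOF.
sub fetch_page {
    my ($host, $line) = @_;
    my $sock = IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => 80,
        Proto    => 'tcp',
        Timeout  => 10,
    ) or die "Socket could not be created: $@\n";
    print $sock request_for($line);
    my $page = '';
    while (defined(my $chunk = <$sock>)) {
        $page .= $chunk;               # the server closes after one HTTP/1.0 response
    }
    close $sock;
    return $page;
}

# Usage (both arguments are placeholders): perl fetch.pl www.somewhere.com urls.txt
if (@ARGV == 2) {
    my ($host, $file) = @ARGV;
    open my $urls, '<', $file or die "Can't open $file for reading: $!\n";
    while (my $line = <$urls>) {
        print fetch_page($host, $line);
    }
    close $urls;
}
```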
Re: socket won't let me fetch multiple web pages (HTTP v1.1)
by tye (Sage) on Mar 20, 2003 at 15:45 UTC

    Some good answers already and pg hinted at part of the problem.

    Your inner read loop will not end until end-of-file on the socket. Once you get end-of-file, you can't get more data from that connection.

    A second part of the problem is that you are specifying HTTP v1.0, which doesn't support multiple requests per connection. If you specify HTTP v1.1, then you could do multiple requests per connection, but you'll have to change your loop to not go all the way to end-of-file. You may also need to not use the read-line operator (<$sock>) since it will wait forever looking for a record separator and I'm not certain that HTTP v1.1 responses are guaranteed to end in a newline.

    If you really want to do this, then you'll need to learn the HTTP v1.1 protocol. You can look at how LWP does this as one step in that process. Searching for "1.1" I found that LWP/Protocol/http.pm has some of the code that handles this. A simple Google search for "http 1.1 standard" will find you the full details of the HTTP v1.1 protocol.

    Or you could just use a module that has already done all of this work for you as several others have already suggested.

                    - tye
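
To illustrate tye's point about not reading all the way to end-of-file, here is a minimal sketch that reads exactly one response from a handle: header lines up to the blank line, then as many body bytes as the Content-Length header announces. It assumes the response carries a Content-Length header and does not handle chunked transfer encoding, which a real HTTP/1.1 client must also support. read_response is a hypothetical helper, and the demo runs on an in-memory handle, not a live socket.

```perl
use strict;
use warnings;

# Read one HTTP response from $fh; returns (status line, headers hashref, body),
# or an empty list at end-of-file.
sub read_response {
    my ($fh) = @_;

    my $status = <$fh>;
    return unless defined $status;
    $status =~ s/\r?\n\z//;

    my %header;
    while (defined(my $line = <$fh>)) {
        $line =~ s/\r?\n\z//;
        last if $line eq '';                    # blank line ends the header block
        my ($name, $value) = split /:\s*/, $line, 2;
        $value =~ s/\s+\z// if defined $value;
        $header{lc $name} = $value;
    }

    # Assumption: the server sent Content-Length (no chunked encoding here).
    my $length = $header{'content-length'};
    die "no Content-Length; cannot tell where the body ends\n"
        unless defined $length;

    # The body need not end in a newline, so read by byte count, not by line.
    my $body = '';
    while (length($body) < $length) {
        my $got = read($fh, $body, $length - length($body), length($body));
        die "connection closed mid-body\n" unless $got;
    }
    return ($status, \%header, $body);
}

# Demo on an in-memory handle holding two back-to-back responses.
my $wire = "HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello"
         . "HTTP/1.1 200 OK\r\nContent-Length: 3\r\n\r\nbye";
open my $in, '<', \$wire or die "cannot open in-memory handle: $!\n";
while (my ($status, $headers, $body) = read_response($in)) {
    print "$status => $body\n";
}
```

The same function could be handed a connected socket in place of the in-memory handle, since both the line reads and the counted read go through the same buffered handle.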