Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello guys, sorry for asking this question, but I have spent the past two hours on the net and I don't know what to do. I want to do some socket programming so that I can read in a web page and then analyze its contents. I don't have Socket installed. I would like to know where to get it and how to install it. I have a Linux machine (7.3). Thanks in advance.

Re: IO::Socket
by diotalevi (Canon) on Mar 18, 2003 at 12:03 UTC

    Use LWP and ignore all the socket business. I've copied an example right from the documentation I linked to.

    # Create a user agent object
    use LWP::UserAgent;
    $ua = LWP::UserAgent->new;
    $ua->agent("MyApp/0.1 ");

    # Create a request
    my $req = HTTP::Request->new(POST => 'http://www.perl.com/cgi-bin/BugGlimpse');
    $req->content_type('application/x-www-form-urlencoded');
    $req->content('match=www&errors=0');

    # Pass request to the user agent and get a response back
    my $res = $ua->request($req);

    # Check the outcome of the response
    if ($res->is_success) {
        print $res->content;
    } else {
        print "Bad luck this time\n";
    }
Re: IO::Socket
by Corion (Patriarch) on Mar 18, 2003 at 12:13 UTC

    If all you need to do is simple web scraping, WWW::Mechanize and HTML::TableExtract might be of interest to you as well. WWW::Mechanize offers you a browser-like API, while HTML::TableExtract offers you an easy way to extract rows from HTML tables.

    A module written by me is WWW::Mechanize::Shell, which offers you a command line browser that can output a Perl program recreating all actions done by you.

    perl -MHTTP::Daemon -MHTTP::Response -MLWP::Simple -e ' ; # The
    $d = new HTTP::Daemon and fork and getprint $d->url and exit;#spider
    ($c = $d->accept())->get_request(); $c->send_response( new #in the
    HTTP::Response(200,$_,$_,qq(Just another Perl hacker\n))); ' # web
•Re: IO::Socket
by merlyn (Sage) on Mar 18, 2003 at 12:38 UTC
    If you don't have IO::Socket, you have either a very old Perl, or a very broken Perl. It's unlikely that any other comment in this thread will be useful until you remedy that situation first. You can install an entire modern Perl in your home directory. I know, I had to do that at one of my prior ISPs. (Now I'm my own ISP. {grin})
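    One way to check whether IO::Socket is really missing is to try loading it. The following is a minimal sketch, not part of the original reply; the helper name module_version is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: try to load a module by name.
# Returns its version string on success, or undef if it cannot be loaded.
sub module_version {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # IO::Socket -> IO/Socket.pm
    return undef unless eval { require $file; 1 };
    my $version = $module->VERSION;
    return defined $version ? $version : 'unknown';
}

for my $module (qw(IO::Socket IO::Socket::INET)) {
    my $version = module_version($module);
    print defined $version
        ? "$module is installed (version $version)\n"
        : "$module is missing\n";
}

# $] holds the version of the running perl interpreter
print "Running Perl $]\n";
```

    If both modules are reported as missing, that points at the Perl installation itself rather than at a module you need to download.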

    -- Randal L. Schwartz, Perl hacker
    Be sure to read my standard disclaimer if this is a reply.

Re: IO::Socket
by l2kashe (Deacon) on Mar 18, 2003 at 14:19 UTC
    All the other suggestions in this thread are right on, but I'll answer the question in a slightly different way.
    use IO::Socket;

    $www = IO::Socket::INET->new(
        PeerAddr => 'some_host_or_ip',
        PeerPort => '80_or_whatever_port_http_is_on',
        Proto    => 'tcp',
        Type     => SOCK_STREAM
    ) || die "Cant connect to some_host_or_ip\n$!\n";

    print $www "GET / HTTP/1.0\n\n";

    while (<$www>) {
        push(@response, $_);
    }
    close($www);

    print "Got Back\n";
    for (@response) {
        print;
    }
    That should be syntactically correct. I didn't use strict, and I didn't account for an HTTP server that accepts the connection but never responds, in which case while (<$www>) would block forever. I also didn't remove the HTML tags from what came back. These reasons, and a few more, are why people are pointing you to modules instead of simply answering the question you asked.
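    For comparison, here is a rough sketch of what handling some of those caveats by hand might look like: strict mode, CRLF line endings, a connect timeout, an alarm around the read loop, and splitting headers from the body. The sub names (build_request, split_response, fetch) and the 10 and 30 second limits are assumptions chosen for illustration, and it still does no HTML parsing.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Socket::INET;

# Build a minimal HTTP/1.0 request; HTTP lines end in CRLF.
sub build_request {
    my ($host, $path) = @_;
    return "GET $path HTTP/1.0\r\nHost: $host\r\n\r\n";
}

# Split a raw response into status line, header lines, and body.
sub split_response {
    my ($raw) = @_;
    my ($head, $body) = split /\r?\n\r?\n/, $raw, 2;
    $head = '' unless defined $head;
    $body = '' unless defined $body;
    my ($status, @headers) = split /\r?\n/, $head;
    $status = '' unless defined $status;
    return ($status, \@headers, $body);
}

# Connect, send the request, and read until the server closes the connection.
sub fetch {
    my ($host, $port, $path) = @_;
    my $sock = IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => $port,
        Proto    => 'tcp',
        Timeout  => 10,    # connect timeout in seconds (arbitrary choice)
    ) or die "Can't connect to $host:$port: $@\n";

    print $sock build_request($host, $path);

    # The alarm guards against a server that accepts but never answers.
    local $SIG{ALRM} = sub { die "Timed out reading from $host\n" };
    alarm 30;
    my $raw = '';
    while (my $line = <$sock>) {
        $raw .= $line;
    }
    alarm 0;
    close $sock;

    return split_response($raw);
}

# Only touches the network when a host is given on the command line.
if (@ARGV) {
    my ($status, $headers, $body) = fetch($ARGV[0], 80, '/');
    print "Status: $status\n";
    print $body;
}
```

    Even this leaves out redirects, chunked responses, and proxies, which is the kind of work LWP already does for you.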

    Happy Hacking

    /* And the Creator, against his better judgement, wrote man.c */
Re: IO::Socket
by Anonymous Monk on Mar 19, 2003 at 01:43 UTC
    Thanks for all the replies. Looks like there is still quite a lot in store for me.
    Thanks