"use IO::Socket;
use Socket;
What would be the difference?"
While I could repeat back to you what the Perl docs say, I imagine you could do that yourself. Instead, I found something here at PerlMonks which may be of use.
http://www.perlmonks.org/?node_id=104273

SOCKET - This field contains a pointer to an already existing socket? Huh?
No, you provide the filehandle you want to use to reference the socket that this function will create. It's similar to open.
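To make the open comparison concrete, here is a minimal sketch. The lexical handle $sock and the choice of IPPROTO_TCP as the protocol argument are my own picks for illustration, not something from your crawler code.

```perl
use strict;
use warnings;
use Socket qw(PF_INET SOCK_STREAM IPPROTO_TCP);

# socket() creates a new socket and attaches it to the handle we pass in,
# much as open() attaches a file to its first argument.
# $sock is an illustrative name; the handle did not exist before this call.
socket(my $sock, PF_INET, SOCK_STREAM, IPPROTO_TCP)
    or die "socket: $!";

# The handle now refers to a real descriptor owned by this process.
my $fd = fileno($sock);
print "socket handle is backed by file descriptor $fd\n";

close($sock) or die "close: $!";
```

Nothing has touched the network yet at this point. The handle is just a name for a socket that is not connected to anything.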

PF_INET - Could also be AF_INET. Either Address Family or protocol family. What would that mean?
OK, so in today's network programming world, there is no difference (says that with his fingers crossed or whatever kids do nowadays to indicate they're lying). Use AF since it's the commonly accepted way of doing things now. Not that it matters. In the socket.h that Socket.pm is based on, they're defined to be the same thing: #define PF_INET AF_INET
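If you'd rather check than take my word for it, a quick sketch like this prints both constants as exported by the Socket module. I'm assuming a common platform such as Linux or a BSD, where the two values match.

```perl
use strict;
use warnings;
use Socket qw(AF_INET PF_INET);

# Both constants come straight from the system headers via Socket.pm.
my $same = (AF_INET == PF_INET) ? "yes" : "no";
printf "AF_INET=%d PF_INET=%d same=%s\n", AF_INET, PF_INET, $same;
```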

SOCK_STREAM - Going through some existing crawler code, I couldn't even locate where this stream came from. Is it there by default?
This indicates what type of socket you want to create. SOCK_STREAM indicates TCP/IP. This is what you want for a web crawler. The other options can be found here: http://publib.boulder.ibm.com/infocenter/aix/v6r1/index.jsp?topic=/com.ibm.aix.progcomm/doc/progcomc/skt_types.htm
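As a rough illustration of the type argument, the sketch below creates one stream socket and one datagram socket, then asks the OS what type each one is. The variable names are made up, SOCK_DGRAM (UDP) is there only for contrast, and passing 0 as the protocol lets the system pick the default for each type.

```perl
use strict;
use warnings;
use Socket qw(PF_INET SOCK_STREAM SOCK_DGRAM SOL_SOCKET SO_TYPE);

# Protocol 0 means "default for this type": TCP for SOCK_STREAM,
# UDP for SOCK_DGRAM on an ordinary IPv4 stack.
socket(my $tcp, PF_INET, SOCK_STREAM, 0) or die "socket (stream): $!";
socket(my $udp, PF_INET, SOCK_DGRAM,  0) or die "socket (dgram): $!";

# getsockopt() returns a packed integer; SO_TYPE reports the socket type.
my $tcp_type = unpack("i", getsockopt($tcp, SOL_SOCKET, SO_TYPE));
my $udp_type = unpack("i", getsockopt($udp, SOL_SOCKET, SO_TYPE));

print "stream socket reports SOCK_STREAM\n" if $tcp_type == SOCK_STREAM;
print "datagram socket reports SOCK_DGRAM\n" if $udp_type == SOCK_DGRAM;
```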

Then, in some places I've read that I would need to run bind(SOCKET,ADDRESS) to assign an IP to the socket (my IP, I guess), but in the example I'm working with this isn't included. Where is the source IP assigned then?
Bind is only needed if the socket you're using is going to start listening. You need to bind the socket to the address and port that you will be listening on. Otherwise, if the socket is going to be used for connecting, you don't need bind, you need connect. When you bind, you choose the address and port; it doesn't "come" from anywhere. For a connecting socket that was never bound, the operating system picks the source address and a free source port for you when you call connect.
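Here is a small sketch of both roles on the loopback interface, so it runs without any outside server. The variable names, the loopback address, and port 0 (which asks the OS for any free port) are my choices for the demo. The listening socket is bound; the connecting socket is not, and getsockname shows what the OS filled in for it.

```perl
use strict;
use warnings;
use Socket qw(PF_INET SOCK_STREAM IPPROTO_TCP INADDR_LOOPBACK
              pack_sockaddr_in unpack_sockaddr_in inet_ntoa);

# Listening side: bind() is where we choose the local address and port.
socket(my $server, PF_INET, SOCK_STREAM, IPPROTO_TCP) or die "socket: $!";
bind($server, pack_sockaddr_in(0, INADDR_LOOPBACK))   or die "bind: $!";
listen($server, 1)                                    or die "listen: $!";
my ($server_port) = unpack_sockaddr_in(getsockname($server));

# Connecting side: no bind() at all, only connect().
socket(my $client, PF_INET, SOCK_STREAM, IPPROTO_TCP) or die "socket: $!";
connect($client, pack_sockaddr_in($server_port, INADDR_LOOPBACK))
    or die "connect: $!";

# Ask the OS which source address and port it assigned to the client.
my ($client_port, $client_ip) = unpack_sockaddr_in(getsockname($client));
printf "client connected from %s:%d to port %d\n",
    inet_ntoa($client_ip), $client_port, $server_port;
```

A crawler only ever plays the client role, which is why the example you are working from has no bind in it.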

I'm left wondering, what about the TCP handshake? Seems we can just skip it and ask for the resource off the bat from the server. Is that always the case?
You're under the impression that sockets operate on the transport level. That is incorrect (to a certain degree). They operate on the level above that, the application level. I say this because when you use send, the data you send is all contained in the payload portion of the packets. You don't specify any of the TCP headers. Thus, as a programmer using sockets, you're operating on the application level. When you declare the socket to be a TCP/IP socket, the connect call takes care of the initial handshake and syncs. If you want to verify this for yourself, run Wireshark and watch the handshake.
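A loopback sketch of that idea, this time using IO::Socket::INET, the object-oriented wrapper around the same socket, bind, listen and connect calls. The request string, the variable names, and the throwaway local listener standing in for a web server are all assumptions for the demo. The only thing the program ever hands to the socket is the request text; there is no handshake code because the connection is already established by the time new returns on the client side.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Stand-in "server": a listener on loopback with an OS-chosen port.
my $listener = IO::Socket::INET->new(
    LocalAddr => '127.0.0.1',
    LocalPort => 0,
    Proto     => 'tcp',
    Listen    => 1,
) or die "listen: $@";

# Client: the constructor calls connect() for us, and the kernel
# performs the TCP handshake during that call.
my $client = IO::Socket::INET->new(
    PeerAddr => '127.0.0.1',
    PeerPort => $listener->sockport,
    Proto    => 'tcp',
) or die "connect: $@";

# Application data only: no TCP headers, flags, or sequence numbers.
my $request = "GET / HTTP/1.0\r\n\r\n";
defined $client->syswrite($request) or die "write: $!";

# The server side sees exactly the payload bytes the client wrote.
my $conn = $listener->accept or die "accept: $!";
my $received = '';
defined $conn->sysread($received, 1024) or die "read: $!";

print "server received ", length($received), " bytes of payload\n";
```

Run it with Wireshark capturing on the loopback interface and you should see the SYN, SYN/ACK and ACK packets go by before the request text does.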

As to your final question regarding storage of web pages, I think I'll leave that question to another fellow monk since my area of expertise is networks. Hope this information helped.

In reply to Re: some SOCKET action by fidesachates
in thread some SOCKET action by Sary
