in reply to Simultaneous writes and as-needed reads from sockets (or The State of Perl Threads...)

This sounds like a perfect job for IO::Select. Unfortunately I've yet to find a comprehensive tutorial for managing several simultaneous non-blocking IO::Socket connections, but I've written stuff to do this in the past. So I can't really recommend any reading for you except some standard texts on writing network code using non-blocking sockets.

Essentially, what I would do is loop over your servers: for each one, create a new IO::Socket object, make it non-blocking, and issue a connect on it. Since it's non-blocking, you won't know whether the connect succeeded until later. Repeat this for all of your sockets, and then enter a select loop (via IO::Select).
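
A minimal sketch of that connect phase, assuming plain TCP and IPv4. The helper names start_connect and finish_connects are made up for illustration, and checking SO_ERROR once the socket turns writable is the usual convention for learning how a non-blocking connect ended, not something specific to IO::Select:

```perl
use strict;
use warnings;
use IO::Socket::INET;
use IO::Select;
use Socket qw(inet_aton pack_sockaddr_in SO_ERROR);

# Hypothetical helper: begin a non-blocking TCP connect and return at once.
sub start_connect {
    my ($host, $port) = @_;
    my $ip = inet_aton($host) or return;    # caveat: name lookups still block
    # No PeerAddr given, so this only creates the socket; it does not connect.
    my $sock = IO::Socket::INET->new(Proto => 'tcp', Blocking => 0) or return;
    return $sock if connect($sock, pack_sockaddr_in($port, $ip));
    return $sock if $!{EINPROGRESS} || $!{EWOULDBLOCK};   # still in progress
    return;                                 # immediate failure
}

# Hypothetical helper: wait for pending connects to resolve.
# Returns two array refs: sockets that connected, and sockets that did not.
sub finish_connects {
    my ($timeout, @socks) = @_;
    my $pending = IO::Select->new(@socks);
    my (@ok, @failed);
    while ($pending->count) {
        my @ready = $pending->can_write($timeout) or last;   # timed out
        for my $s (@ready) {
            $pending->remove($s);
            my $err = $s->sockopt(SO_ERROR);   # 0 means the connect succeeded
            if (defined $err && $err == 0) { push @ok, $s }
            else                           { push @failed, $s }
        }
    }
    push @failed, $pending->handles;        # anything left over timed out
    return (\@ok, \@failed);
}
```

Call start_connect once per server, collect the sockets it returns, then hand them all to finish_connects with whatever timeout you can tolerate.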

Keep data you're planning on sending in a buffer for that socket, select for writing those sockets that have data waiting to go out, write that data, and drop from the buffer the amount of data that was written.
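
As a rough sketch of that outbound buffering (the %outbuf hash and the queue_data and flush_ready names are hypothetical, and error handling is left to the caller):

```perl
use strict;
use warnings;
use IO::Select;

my %outbuf;   # pending outbound bytes, keyed by stringified handle

# Queue data for a socket; nothing is written until the socket is writable.
sub queue_data {
    my ($sock, $data) = @_;
    $outbuf{$sock} .= $data;
}

# Write as much as each writable socket accepts and drop exactly that much
# from its buffer. Returns the total number of bytes written.
sub flush_ready {
    my ($timeout, @socks) = @_;
    # Only select for writing on sockets that actually have data queued.
    my @want = grep { defined $outbuf{$_} && length $outbuf{$_} } @socks;
    return 0 unless @want;
    my $total = 0;
    for my $s (IO::Select->new(@want)->can_write($timeout)) {
        my $n = syswrite($s, $outbuf{$s});
        next unless defined $n;           # would block or failed; try later
        substr($outbuf{$s}, 0, $n, '');   # remove only what was written
        $total += $n;
    }
    return $total;
}
```

Selecting for writing only on sockets with a non-empty buffer matters: an idle socket is almost always writable, so including it would make the loop spin.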

Select for reading on all of your sockets, process incoming data (being careful to preserve partial lines for next time), pass it along to whichever socket's buffer it should go to, etc.
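
The read side might look something like this, assuming a line-oriented protocol. The %inbuf hash and the read_ready name are invented for the example; note that a partial line still sitting in the buffer at EOF is simply discarded here:

```perl
use strict;
use warnings;
use IO::Select;

my %inbuf;    # unprocessed bytes per handle, including any partial line

# Read whatever is available and return [handle, line] pairs for every
# complete line. Incomplete trailing data stays in %inbuf for next time.
sub read_ready {
    my ($sel, $timeout) = @_;
    my @lines;
    for my $s ($sel->can_read($timeout)) {
        my $n = sysread($s, my $chunk, 4096);
        # Transient conditions: leave the handle in the set and retry later.
        next if !defined $n && ($!{EAGAIN} || $!{EWOULDBLOCK} || $!{EINTR});
        if (!$n) {                        # 0 = EOF, undef = hard error
            $sel->remove($s);
            delete $inbuf{$s};
            close $s;
            next;
        }
        $inbuf{$s} .= $chunk;
        # Peel off complete lines; tolerate both \n and \r\n endings.
        while ($inbuf{$s} =~ s/^(.*?)\r?\n//) {
            push @lines, [$s, $1];
        }
    }
    return @lines;
}
```

Using sysread rather than <$s> is deliberate: buffered reads can hold data that select cannot see, so the two should not be mixed on the same handle.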

Perhaps someone else can provide links to a good tutorial on building something like this.

In addition, migrating to an event-based architecture (such as POE) might be useful as well. I suspect a lot of this is "built-in".

Re: Re: Simultaneous writes and as-needed reads from sockets (or The State of Perl Threads...)
by deprecated (Priest) on Jan 13, 2001 at 03:38 UTC
    Hiya Fastolfe...

    I checked out POE (when you said "architecture" I got to thinking "hardware"; it's actually a Perl module available from CPAN, for anyone curious), and I'm going to give it a lookover this evening.

    What you said about IO::Select is something everyone else has said to me. However, IO::Select is what the module is using. I'm not actually doing any of the socketting myself. I wanted to be able to just launch a lot of processes and not deal with them until they had something to say. Like, for instance, in bash, doing this:

    $ cat /etc/services & cat /etc/sendmail.cf

    is going to get me lines to the terminal from both files, roughly intermingled. I don't see why I shouldn't be able to do this from within Perl, as what I am actually doing is pretty non-intensive. If this clarifies things and you're able to suggest something else, by all means do. Otherwise I'm going to have a look over POE and see how relevant it is and whether I can actually grok it.

    Thanks again,
    deprecated.

    --
    i am not cool enough to have a signature.

      and just send a login request to all of them, and then deal with them as they get back to me, or, if necessary, destroy the connection. Instead, it takes 10 minutes to get to only 60 of the servers because I am waiting for every one of them.

      This doesn't sound like select-based behavior. Perhaps you're mixing select with standard blocking calls? A good non-blocking select-based implementation of something like this should be able to handle dozens of network sockets simultaneously without a significant amount of delay. If you're hitting multiple servers, you should be able to number your simultaneous connections in the hundreds and under any decent system, your bottleneck will be with your network connection, not the app (unless you're doing a lot of processing with the inbound data I guess).

      I mean, there are two ways you can go about this. I have no idea how Napster.pm does its thing. If you say it's working with select, fine. I don't get what the purpose of multiple threads is, in that case, but whatever. It's not important. With select, you work with Perl filehandles. These can be network sockets, files, STDIN/STDOUT, pipes, whatever. If you want to use open($S, "-|") to fork off a child process, IO::Select would be happy to use $S as a filehandle to watch. You can repeat this a dozen times to get the behavior you're looking for, with each child doing an exec or whatever it is you want. The select call will be happy to tell you which file handles have data waiting to be read.

      So basically, going back to your problem, it sounds like you neither want to nor can code any direct hooks into the way this other module is doing its network handling. So realistically you can't make it "go faster" when it's working with multiple servers. What, then, do you plan to do? Do you want to fork off your process into 40 sub-processes, each one devoted to a single server? If so, select is still very much an option. Instead of selecting against network sockets, select against pipes (such as that perlipc version of open above), and process inbound data from each of your children in turn.

      If something like this is the route you want to take, I highly recommend reading perlipc. You can always just fork and let each child write to STDOUT, but it's difficult to *capture* that information in a controlled way without using true pipes and mediating between them by using select, so that you can avoid blocking while waiting for data from one of them.
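
      A rough sketch of that fork-and-select arrangement, for what it's worth. The spawn_all and drain_all names are made up, each child here just execs an external command, and on a platform without a real fork this may not behave the same way:

```perl
use strict;
use warnings;
use IO::Select;
use POSIX ();

# Fork one child per command, each with a pipe back to the parent.
# Each command is an array ref: [program, args...]. Returns an IO::Select.
sub spawn_all {
    my @commands = @_;
    my $sel = IO::Select->new;
    for my $cmd (@commands) {
        my $pid = open(my $fh, '-|');   # forking open; child's STDOUT feeds $fh
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {                # child: become the command
            exec { $cmd->[0] } @$cmd;
            warn "exec $cmd->[0] failed: $!\n";
            POSIX::_exit(127);          # never fall back into the parent's code
        }
        $sel->add($fh);                 # parent: watch this child's pipe
    }
    return $sel;
}

# Collect output chunks in arrival order until every child has finished.
sub drain_all {
    my ($sel) = @_;
    my @chunks;
    while ($sel->count) {
        for my $fh ($sel->can_read) {   # blocks until some child has output
            my $n = sysread($fh, my $chunk, 4096);
            if ($n) { push @chunks, $chunk; next }
            $sel->remove($fh);          # EOF or error: this child is done
            close $fh;                  # closing the pipe also reaps the child
        }
    }
    return @chunks;
}
```

      Something like print for drain_all(spawn_all(['cat', '/etc/services'], ['cat', '/etc/sendmail.cf'])) should then give roughly the intermingled output of the shell example, with the parent never blocked on any single child.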

      If I've missed the boat on this, if your plan to break these tasks up is altogether more bizarre than anything I've mentioned, by all means let me know.