in reply to Rapid inter-machine communication on internal network
```perl
use IO::Socket::INET;

our @servers;             # Fill in the IP addresses/names of all of the servers
our $server_fdset = '';

sub setup {
    for my $server (@servers) {
        $server->{socket} = IO::Socket::INET->new(
            Proto    => 'tcp',
            PeerAddr => "$server->{host}:$server->{port}",
        ) or die "connection to server $server->{name}\n";
        # In reality, you should do something smarter if a server fails --
        # return an incomplete result, or have redundant servers
        vec($server_fdset, fileno($server->{socket}), 1) = 1;
    }
}

sub query {
    my ($what) = @_;

    for my $server (@servers) {
        $server->{socket}->syswrite("how many $what you got, dude");
    }

    my $num_bananas = 0;
    my $done = 0;
    my $fds = $server_fdset;
    while ($done < @servers) {
        my $rfds;
        select($rfds = $fds, undef, undef, undef);
        for my $server (@servers) {
            if (vec($rfds, fileno($server->{socket}), 1)) {
                my $buf;
                $server->{socket}->sysread($buf, 4);
                # unpack returns a list; capture its first element
                my ($result) = unpack("L", $buf);
                $num_bananas += $result;

                # Done with this server
                $done++;
                vec($fds, fileno($server->{socket}), 1) = 0;
            }
        }
    }

    # Repeat, using the new request
}
```

Obviously, that's incomplete and totally untested. And you shouldn't really be calling sysread or syswrite without checking the return value. You might want to use something else for reading and/or writing so you don't have to deal with short reads/writes. And, of course, if a server goes down, you probably don't want to hang forever waiting for it to return a response (especially if you got a zero-length read from it, which tells you it shut down). You might also want to use IO::Select instead of raw select, and you might want to store your servers in a hash or array keyed by the file descriptor, etc.
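To make the caveats about unchecked sysread calls, short reads, and hanging forever a bit more concrete, here is a minimal sketch of a read helper built on IO::Select. The name `read_exactly` and the one-second timeout are my own placeholders, not anything from the code above, and the timeout applies to each wait rather than to the whole read. A local socket pair stands in for a real server connection so the snippet runs on its own.

```perl
use strict;
use warnings;
use IO::Select;
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);

# Hypothetical helper: read exactly $len bytes from $sock.
# Returns the bytes, or undef on timeout, read error, or EOF.
sub read_exactly {
    my ($sock, $len, $timeout) = @_;
    my $sel = IO::Select->new($sock);
    my $buf = '';
    while (length($buf) < $len) {
        # can_read returns an empty list if nothing arrives within $timeout
        my @ready = $sel->can_read($timeout);
        return unless @ready;
        # Append at the current end of $buf so short reads accumulate
        my $n = sysread($sock, $buf, $len - length($buf), length($buf));
        return unless defined $n;   # read error
        return if $n == 0;          # zero-length read: the peer shut down
    }
    return $buf;
}

# Demo: a socket pair plays the role of client and server.
socketpair(my $client, my $server, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";
defined syswrite($server, pack("L", 7)) or die "syswrite: $!";

my $reply = read_exactly($client, 4, 1);
my ($count) = unpack("L", $reply);
print "count: $count\n";

# Once the server side closes, the helper reports failure instead of hanging.
close $server;
my $after_close = read_exactly($client, 4, 1);
print "server gone\n" unless defined $after_close;
```

In the query loop you would call something like this in place of the bare sysread and drop the server from the fd set whenever it returns undef.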
Another thing I did was to duplicate each server. So every machine was responsible for two different partitions of the data. It helped for failover, but I also got (perhaps excessively) clever: I would first send out a query for the "main" partition for each server. Then, if server A (handling partitions 1 and 6) returns a result for partition 1 before server B (handling partitions 6 and 3) returns its result for partition 6, I'd send the query to server A for partition 6. It tangles up the logic a bit, because you have to keep track of state for each server. But if your response time variance is high, you can speed things up quite a bit. (You could obviously send both requests to each server initially, but that tends to bog down the server farm. Depends on what your load looks like -- it trades capacity for latency.)
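The backup-partition trick is easier to see with numbers, so here is a small simulation of it. It is only a sketch: the server names, partition assignments, and latencies are made up, there are no sockets, and each server simply takes its fixed latency to answer any one query. A server that finishes its main partition is asked for its backup partition if nobody has answered that one yet.

```perl
use strict;
use warnings;

# Made-up layout: each server has a main partition and a backup partition.
my %servers = (
    A => { primary => 1, backup => 6, latency => 1 },
    B => { primary => 6, backup => 3, latency => 5 },
    C => { primary => 3, backup => 1, latency => 2 },
);

sub simulate {
    my (%servers) = @_;
    my %answered;   # partition => time of the first answer
    my @log;
    # Each event is [finish_time, server_name, partition].
    my @events = map { [ $servers{$_}{latency}, $_, $servers{$_}{primary} ] }
                 sort keys %servers;
    while (@events) {
        @events = sort { $a->[0] <=> $b->[0] || $a->[1] cmp $b->[1] } @events;
        my ($t, $name, $part) = @{ shift @events };
        $answered{$part} = $t unless exists $answered{$part};
        my $s = $servers{$name};
        # Finished the main partition and the backup is still unanswered:
        # send this server the backup query too.
        if ($part == $s->{primary} && !exists $answered{ $s->{backup} }) {
            push @events, [ $t + $s->{latency}, $name, $s->{backup} ];
            push @log, "t=$t: asked $name for partition $s->{backup}";
        }
    }
    return (\%answered, \@log);
}

my ($answered, $log) = simulate(%servers);
print "$_\n" for @$log;
print "partition $_ answered at t=$answered->{$_}\n" for sort keys %$answered;
```

With these numbers, fast server A answers partition 6 long before slow server B does, so the whole query is bounded by the second-slowest server rather than the slowest. A real client would stop waiting as soon as every partition has an answer and ignore B's late reply.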
I had some even wackier stuff, so I ended up abstracting it out some: each server structure had a code ref storing its next action, and a slot for storing a description of what it should wait for before executing that next action. (The exact pattern of queries was dependent on the results of earlier queries, sometimes just from the same server, sometimes from others.)
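A rough sketch of that "next action plus what to wait for" structure might look like the following. All of the names here (`then_wait`, `dispatch`, the event strings, the handlers) are invented for illustration and are not from my original code. Each server hash carries a `next` code ref and a `wait_for` slot, and a dispatcher only runs the action when the matching event shows up.

```perl
use strict;
use warnings;

# Record which event a server is waiting for and what to run when it arrives.
sub then_wait {
    my ($server, $event, $action) = @_;
    @$server{qw(wait_for next)} = ($event, $action);
}

sub send_count_query {
    my ($server) = @_;
    push @{ $server->{trace} }, 'sent count query';
    then_wait($server, 'count_reply', \&handle_count);
}

# The follow-up query depends on the result of the earlier one.
sub handle_count {
    my ($server, $count) = @_;
    push @{ $server->{trace} }, "got count $count";
    if ($count > 0) {
        push @{ $server->{trace} }, 'sent detail query';
        then_wait($server, 'detail_reply', \&handle_details);
    } else {
        then_wait($server, undef, undef);   # nothing more to do
    }
}

sub handle_details {
    my ($server, $details) = @_;
    push @{ $server->{trace} }, "got details: $details";
    then_wait($server, undef, undef);
}

# Run the server's next action only if $event is what it is waiting for.
sub dispatch {
    my ($server, $event, @payload) = @_;
    my $waiting = $server->{wait_for};
    return 0 unless defined $waiting && $waiting eq $event;
    $server->{next}->($server, @payload);
    return 1;
}

my $server = { name => 'A', trace => [] };
then_wait($server, 'start', \&send_count_query);
dispatch($server, 'start');
my $ignored = dispatch($server, 'detail_reply', 'too early');   # not expected yet
dispatch($server, 'count_reply', 3);
dispatch($server, 'detail_reply', 'ripe');
print "$_\n" for @{ $server->{trace} };
```

Passing the server hash into each action, rather than closing over it, avoids a reference cycle between the hash and its own code ref.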
Finally, it's nice to keep a timestamp of the transmission of every request and the response's arrival. I wrote a little app that took all of those timestamps and graphed them so you could immediately see where the "long pole" was, and if the pattern looked more or less correct. It was useful in debugging as well as optimization. And it looked cool, especially since it picked a different color for every request..response line.
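The bookkeeping side of that is small. Here is a sketch, assuming one hash entry per request id; `mark_sent`, `mark_received`, and `long_pole` are hypothetical names, and the request ids and times in the demo are invented. Timestamps default to Time::HiRes::time but can be passed in explicitly, which makes the example deterministic. The graphing part is left out.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

my %timing;   # request id => { sent => ..., received => ... }

# Both markers take an optional explicit timestamp (handy for replaying logs).
sub mark_sent     { my ($id, $t) = @_; $timing{$id}{sent}     = $t // time; }
sub mark_received { my ($id, $t) = @_; $timing{$id}{received} = $t // time; }

# Return the id and elapsed time of the slowest completed request.
sub long_pole {
    my ($slowest, $max) = (undef, -1);
    for my $id (sort keys %timing) {
        my $r = $timing{$id};
        next unless defined $r->{sent} && defined $r->{received};
        my $elapsed = $r->{received} - $r->{sent};
        ($slowest, $max) = ($id, $elapsed) if $elapsed > $max;
    }
    return ($slowest, $max);
}

# Demo with invented request ids and times, in seconds.
mark_sent('A:1', 0.0);
mark_sent('B:6', 0.0);
mark_received('A:1', 0.25);
mark_received('B:6', 1.5);

my ($id, $elapsed) = long_pole();
print "long pole: $id took ${elapsed}s\n";
```

Dumping `%timing` to a file after each query gives a graphing script everything it needs to draw one request..response line per id.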
Today, I'd probably take the plunge and learn POE.