Re: Handling multiple clients
by graff (Chancellor) on Sep 05, 2004 at 03:48 UTC
I wasn't sure myself, so I just did a simple-minded test, and sure enough, when the child starts up, it takes up as much memory as the parent, which means that you're getting a full copy of your 2GB in-memory data each time you fork. Forking 5 children would pretty much guarantee that the OS will need to do a lot of memory swapping to run all those huge child processes. I think the delays you're seeing are not so much the CPU load of the children, but rather the I/O wait imposed by swapping. (Some versions of "top" will report the total percentage of processing time devoted to "I/O wait" -- if your version of "top" shows that, you'll probably see it skyrocket.)
If you want some sort of approach that actually shares a single copy of the 2GB data set among multiple clients that are being served simultaneously, I think you'll need threads rather than forking. I'm not a reliable source on this, 'cuz I've never used threads myself, but... if I'm not mistaken (no guarantee on that), one of the advantages of threading is that you really can share a single store of in-memory data across threads, whereas you can't do that across children forked from a given parent. I hope others can elaborate from personal experience...
Meanwhile, you may want to reassess your requirements. How important is it, really, for multiple clients to be serviced in parallel (given that doing so might not be doable without a serious loss of efficiency)? Is there any chance the process could work from a MySQL database, rather than from in-memory storage? (Multiple concurrent access to a 2GB dataset is a lot easier to implement efficiently using a real RDBMS, and MySQL is pretty zippy for a lot of tasks.)
| [reply] |
What operating system do you use and how did you measure memory usage? I expect anything decent to share all of the pages, marking them Copy-on-Write.
As far as I understand Perl threads, every new interpreter copies everything not explicitly shared. I'd expect that to do even worse for the poster's question.
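For what it's worth, per-process figures in "top" count shared pages against every process, so parent and child can look the same size even while the pages are still shared. A rough way to check (a sketch of mine, not anything from the posts above; it assumes a Linux system with /proc/meminfo) is to watch system-wide free memory while forking several children:
#!/usr/bin/perl
use strict;
use warnings;

# Sum MemFree + Cached from /proc/meminfo, in kB.
sub mem_free_kb {
    open my $fh, '<', '/proc/meminfo' or die "/proc/meminfo: $!";
    my %m;
    while (<$fh>) { $m{$1} = $2 if /^(\w+):\s+(\d+)\s+kB/ }
    return $m{MemFree} + ( $m{Cached} || 0 );
}

my @a = ( 0 .. 5_000_000 );                  # a few hundred MB of scalars
printf "before fork: %d kB free\n", mem_free_kb();

my @kids;
for ( 1 .. 3 ) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ( $pid == 0 ) { sleep 10; exit 0 }    # children just sit on the data
    push @kids, $pid;
}

sleep 2;
printf "after 3 forks: %d kB free\n", mem_free_kb();   # barely moves if pages are COW-shared
waitpid( $_, 0 ) for @kids;
If the pages really are Copy-on-Write, free memory hardly changes even though "top" shows each child at the full size.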
| [reply] |
perl -e '$|=1; @a=(0..10_000_000);
$child = fork();
die "fork failed\n" unless (defined $child);
print "parent = $$\nchild = $child\n" if $child;
sleep 30'
and while that was running, I ran "top" in another window; both processes showed up with the same size.
I expect anything decent to share all of the pages, marking them Copy-on-Write.
I guess I'd want to test different cases, with different amounts of data and a more realistic set of operations, to see whether I get what you expect. (I probably won't do that, actually -- it's not the sort of thing I need...)
As far as I understand Perl threads, every new interpreter copies everything not explicitly shared. I'd expect that to do even worse for the poster's question.
Thanks for the clarification about threads. I'll grant that my experience with the concept of data sharing across processes is limited. (I'm sure I studied the C functions that create shared memory in Solaris years ago -- and I might even have used them a couple times...) As for threads, I might use them some day, and till then, I guess I should keep my mouth shut about them.
(update: ...um, if the OP happens to have 2GB organized into a few hefty data structures, and those are explicitly shared, why would that be worse than forking? Are the methods for declaring what is shared really unpleasant, or something?) | [reply] [d/l] |
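For what it's worth, declaring the sharing isn't the unpleasant part; the catch is that nested structures have to be shared at every level, and sharing copies the data into shared storage. A small sketch (using a reasonably recent threads::shared; shared_clone() wasn't available in the oldest versions):
use strict;
use warnings;
use threads;
use threads::shared;

# A flat shared hash is straightforward:
my %lookup : shared;
$lookup{foo} = 42;

# Nested data must be shared all the way down; shared_clone() copies an
# existing structure into shared storage in one go.
my $table        = { a => [ 1, 2, 3 ], b => [ 4, 5, 6 ] };
my $shared_table = shared_clone( $table );

threads->create( sub {
    lock( %lookup );    # serialise access if other threads write too
    print "foo is $lookup{foo}, b[1] is $shared_table->{b}[1]\n";
} )->join;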
RedHat 9 with the latest updates before they stopped updating.
| [reply] |
I didn't think it was relevant, but I am using Net::Patricia for my data storage.
| [reply] |
Re: Handling multiple clients (use threads)
by BrowserUk (Patriarch) on Sep 05, 2004 at 11:50 UTC
Provided that you create the threads before loading your data, a threaded server works fine. Only the main thread holds a copy of the large volume of data; it shares the requests and replies with the server threads through shared queues (Thread::Queue):
#! perl -slw
use strict;
use IO::Socket;
use threads qw[ yield ];
use threads::shared;
use Thread::Queue;
$| = 1;
our $THREADS ||= 5;
my $listening : shared = 0;
our $ios = IO::Socket::INET->new(
LocalPort => 6969,
Type => &IO::Socket::SOCK_STREAM,
Proto => 'tcp',
Reuse => 1,
Listen => 100,
) or die "IO::S->new failed with $!";
print "$ios";
sub server {
$listening++;
my( $Qquery, $Qreply ) = @_;
my $tid = threads->self->tid;
print "tid:$tid";
## Give the other threads a chance to get up and running.
yield until $listening == $THREADS;
while( my $client = $ios->accept() ) {
chomp( my $query = <$client> );
# print "$tid: $client got: '$query'";
$Qquery->enqueue( "$tid:$query" );
my $reply = $Qreply->dequeue();
print $client $reply;
close $client;
}
$listening--;
}
my @Qs = map{ new Thread::Queue } 0 .. $THREADS;
threads->new( \&server, $Qs[ 0 ], $Qs[ $_ ] )->detach for 1 .. $THREADS;
yield until $listening == $THREADS;
print "Threads $listening running; grabbing data";
open BIGFILE, '< :raw', 'data/50MB.dat' or die "data/50mb.dat: $!";
my $data;
sysread( BIGFILE, $data, -s( BIGFILE ) ) or die "sysread BIGFILE : $!";
close BIGFILE;
while( $listening ) {
my( $tid, $msg ) = split ':', $Qs[ 0 ]->dequeue();
## Process request
print "Received '$msg' from $tid";
$Qs[ $tid ]->enqueue( 'Thankyou for your enquiry' );
}
Partial server log
Client:
#! perl -slw
use strict;
use IO::Socket;
my $socket = IO::Socket::INET->new(
PeerAddr => '127.0.0.1',
PeerPort => 6969,
Proto => "tcp",
Type => SOCK_STREAM
) or die "Couldn't connect to 127.0.0.1:6969 : $@";
# ... do something with the socket
print $socket "Why don't you call me anymore?";
chomp( my $answer = <$socket> );
print "Got: $answer";
# and terminate the connection when we're done
close($socket);
As is, the server doesn't contain any mechanism for shutting it down, but ^C works okay and a SIGINT handler could deal with cleanup if required.
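A minimal sketch of that cleanup idea (the $running flag is my addition, not part of the code above, and since dequeue() blocks, the loop only notices the flag after one more request arrives):
## Replace the main dispatch loop of the server above with something like:
my $running : shared = 1;
$SIG{ INT } = sub { $running = 0 };

while( $running ) {
    my( $tid, $msg ) = split ':', $Qs[ 0 ]->dequeue();
    print "Received '$msg' from $tid";
    $Qs[ $tid ]->enqueue( 'Thankyou for your enquiry' );
}
## ...do any cleanup (closing $ios, flushing logs) here before exiting.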
Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail
"Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon
| [reply] [d/l] [select] |
Re: Handling multiple clients
by lidden (Curate) on Sep 05, 2004 at 01:27 UTC
Are you handling your dead children? That is, catching SIGCHLD. Maybe something like:
$SIG{CHLD} = 'IGNORE';
will help you, although you may want to do something better than just ignoring them.
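A sketch of the usual "something better": reap every exited child without blocking, so any per-child bookkeeping stays accurate (the %children hash here is only illustrative):
use POSIX ':sys_wait_h';

our %children;    # pid => whatever you track per child

$SIG{CHLD} = sub {
    # More than one child can exit per signal, so loop until
    # there is nothing left to reap.
    while ( ( my $pid = waitpid( -1, WNOHANG ) ) > 0 ) {
        delete $children{$pid};
    }
};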
| [reply] [d/l] |
Yes, I added the cookbook code to my original post.
| [reply] |
Re: Handling multiple clients
by johnnywang (Priest) on Sep 05, 2004 at 07:00 UTC
I just wrote a little multi-client server at work. It's a request-response type of situation, i.e., the connections are short-lived. Instead of forking, I used threads. One does need to share data explicitly. Some code samples follow (any comments are appreciated; this hasn't been used in a heavy production environment, so I'm not sure how it scales):
use strict;
use threads;
use threads::shared;
use IO::Socket;
# need to explicitly share variables across threads
our $something_to_share;
share($something_to_share);
my $socket = new IO::Socket::INET(
LocalPort=> 11023,
Proto => "tcp",
Listen => 10,
Reuse => 1)
or die "Socket could not be created, reason: $!";
while( my $client = $socket->accept() ){
threads->new(\&handler, $client)->detach();
}
exit(0);
sub handler{
my $client = shift;
#read request.
my $input = <$client>;
# do something, probably access the
# shared variable $something_to_share;
# and send response back.
print $client "something for you.";
}
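One caveat worth adding: if several handler threads update $something_to_share at the same time, wrap the access in lock() so the updates don't interleave. Roughly (the update itself is just illustrative):
# inside handler(), around any write to the shared variable
{
    lock( $something_to_share );       # released when the block exits
    $something_to_share .= $input;     # illustrative update only
}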
| [reply] [d/l] |
Re: Handling multiple clients
by kscaldef (Pilgrim) on Sep 05, 2004 at 06:07 UTC
I've found that the Cookbook code for dealing with SIGCHLD seems to be a bit unreliable (See $? is -1???). As best I could ever figure, despite documentation to the contrary, it appeared that the signal handling didn't actually handle reentrancy correctly.
However, I think you could get rid of the signal handler and just do something like replace
# And maintain the population.
while (1) {
    sleep;                            # wait for a signal (i.e., child's death)
    for ($i = $children; $i < $PREFORK; $i++) {
        make_new_child();             # top up the child pool
    }
}
with
# And maintain the population.
while ((my $pid = waitpid(-1, 0)) > 0) {
    $children--;
    delete $children{$pid};
    make_new_child();
}
| [reply] [d/l] [select] |
hi jalewis2,
I am not much of a Perl programmer yet (learning, learning...).
But if I were to do it in C, I would have a server process listening for requests. Since the main task seems to be data queries, it would load the data into memory at startup. Upon receiving a request, it would call a function that returns after spawning a worker thread, which does the work and exits. You can maintain the thread count in the server process. I don't think preforking is a good idea, as it keeps consuming resources even when there is no work to do.
| [reply] |
Depending on the type of server you are writing, you may not want the expense of forking for each request. There is certainly a movement in OS design to make forking cheap, but you still shouldn't assume it is without cost. If you typically have short sessions that require high performance and low latency, you probably want to prefork children.
I'm not sure what resources you are worried about the children consuming. COW implementations of forking mean that the children will use only minimal additional memory unless they have to. If the children are simply blocking on a select call, they won't be using any significant amount of CPU either.
| [reply] |
Re: Handling multiple clients
by quai (Novice) on Sep 05, 2004 at 09:14 UTC
Take a look at "17.13. Non-Forking Servers" in the Perl Cookbook from O'Reilly.
"Problem: You want a server to deal with several simultaneous connections, but you don't want to fork a process to deal with each connection." | [reply] |
Re: Handling multiple clients
by zentara (Cardinal) on Sep 05, 2004 at 13:09 UTC
I can't find the node offhand, but I seem to remember a variant of this question being asked recently, and the best advice was to load the "huge data" into a ramdisk, so everything can have access to it, without having to worry about "sharing" it across clients, in your code. Or use a database. Just a thought.
I'm not really a human, but I play one on earth.
flash japh
| [reply] |
Re: Handling multiple clients
by mkirank (Chaplain) on Sep 07, 2004 at 15:12 UTC
2 GB of data in memory will slow down your system and your application. Use DBD::SQLite2: you avoid installing and administering a full database, and it can speed up your process as well.
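Roughly what that looks like (a sketch only; the database file, table, and column names are made up):
use strict;
use warnings;
use DBI;

# File, table, and column names below are purely illustrative.
my $dbh = DBI->connect( 'dbi:SQLite2:dbname=lookup.db', '', '',
                        { RaiseError => 1, AutoCommit => 1 } );

my $sth = $dbh->prepare( 'SELECT value FROM lookup WHERE key = ?' );

sub lookup {
    my $key = shift;
    $sth->execute( $key );
    my ( $value ) = $sth->fetchrow_array;
    $sth->finish;
    return $value;
}

print lookup( 'some key' ), "\n";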
| [reply] |