bravesoul has asked for the wisdom of the Perl Monks concerning the following question:

Hi all and the great website, I am new here and I'm having trouble with the following code, I have mark the problem with 'fix me, ' in the comment. Would you please give me some suggestion? Many thanks.
#!/usr/bin/perl # created by joe.zheng at Wed Apr 1 10:23:14 2009 use 5.008; use threads; # multi-threads needed use threads::shared; # need to share some variables use Thread::Queue; # thread safe communication needed use strict; use Getopt::Long; use HTTP::Daemon; use Time::HiRes qw( time ); # If available.... # sub proto type declaration sub trc(@); # print log message sub notify(@); # parameter: class, payload ... sub init_srv(); sub spawn_miner(); sub spawn_worker(); # global variables my $queue_up = Thread::Queue->new; my %stat : shared; my %data : shared; # configuration my %cnf = ( # Username/Password. usr => 'usr', pwd => 'pwd', # Appr. Position, being used for Ephemeris transfer from server. lat => 47.28, # Degrees lon => 8.56, # Degrees acc => 1000, # meters srv => 'agps.u-blox.com:46434', cmd => 'eph', # our server settings port => 46434, # listen port queue => 5, # listen queue timeout => 10, # rest => 10, # the time value to rest after previous conn +ect from the same addr conn => 3, # the allowed max connection per addr max_con => 20000, # max connection simultaneously ext_srv => '', # extent server to handle requests when t +he main server reach its capacity workers_ini => 8, # prethread count workers_lo => 2, # low water mark workers_hi => 10, # high water mark period => 600, # period to check for new ephemeris # directory doc => 'doc', # the root directory where all the docs live in # misc dbg => 1, # print log for debug purpose ); # parse command line parameter GetOptions(\%cnf, 'usr=s', 'pwd=s', 'lat=f', 'lon=f', 'acc=i', 'srv=s', 'cmd=s', 'period=f', 'port=i', 'queue=i', 'timeout=f', 'rest=f', 'conn=i', 'max_con=i', 'ext_srv=s', 'doc=s', 'dbg=i', ); # fix me, not work in win32, why? #$SIG{'INT'} = $SIG{'TERM'} = sub { lock %stat; $stat{'done'}++ }; my $listener = init_srv(); spawn_worker() foreach ( 1 .. $cnf{'workers_ini'} ); spawn_miner(); trc 'dbg', 'start main loop'; while ( !$stat{'done'} ) { if ( my $msg = $queue_up->dequeue() ) { trc 'dbg', 'new msg'; my ( $class, $payload ) = split /\s+/, $msg, 2; if ( $class eq 'work' ) { my ( $t, $s ) = split /\s+/, $payload, 2; lock %{ $stat{'work'} }; trc 'dbg', 'work stat', $t, $s; if ( $s eq 'gone' ) { delete $stat{'work'}{$t}; } else { $stat{'work'}{$t}{'stat'} = $s; $stat{'work'}{$t}{'time'} = time(); $stat{'work'}{$t}{'work'}++ if $s eq 'idle'; } } elsif ( $class eq 'data' ) { my ( $t, $d ) = split /\s+/, $payload, 2; if ( $d ) { lock %{ $stat{'data'} }; trc 'dbg', 'data', $t, length $d; $stat{'data'}{$t}{'expi'} = time() + $cnf{'period'}; $stat{'data'}{$t}{'data'} = $d; } } else { # nothing to do } } # reap worker if necessary # adjust worker my @worker_idle = sort { $a <=> $b } grep { $stat{'work'}{$_}{'sta +t'} eq 'idle' } keys %{$stat{'work'}}; trc 'dbg', 'idle worker', scalar @worker_idle; if ( @worker_idle < $cnf{'workers_lo'} ) { spawn_worker() for ( 1 .. $cnf{'workers_lo'} - @worker_idle ); } elsif ( @worker_idle > $cnf{'workers_hi'} ) { foreach my $t ( @worker_idle[ 0 .. @worker_idle - $cnf{'worker +s_hi'} - 1 ] ) { lock %{ $stat{'work'} }; $stat{'work'}{$t}{'stat'} = 'done'; } } } sub END { # close all the worker foreach ( keys %{$stat{'work'}} ) { lock %{ $stat{'work'} }; $stat{'work'}{$_}{'stat'} = 'done'; } } sub retrieve_data { # dummy data here return 'dummy data, comming soon'; } sub trc (@) { return if !$cnf{'dbg'} && $_[0] eq 'dbg'; local $, = ', '; print scalar(localtime), @_; print "\n"; } sub init_srv () { # print out the configuration for debug purpose if ( $cnf{'dbg'} ) { print '<configuration>', "\n"; foreach my $k ( sort keys %cnf ) { print $k, ': ', $cnf{ $k }, "\n"; } print '</configuration>', "\n"; } # main thread my $srv = HTTP::Daemon->new( LocalPort => $cnf{'port'}, Reuse => 1, Listen => $cnf{'queue'}, Timeout => $cnf{'timeout'}, ) or die "can't create local socket: $@\n"; trc 'log', "Accepting connections on", $srv->sockhost().':'.$srv-> +sockport(); { lock %stat; $stat{'lock'} = &share( {} ); $stat{'data'} = &share( {} ); $stat{'work'} = &share( {} ); } return $srv; } sub spawn_miner() { sub do_miner { # wait for the main thread finish initialization # fix me, yield can't achieve our purpose, why? # threads->yield(); # periodically check for ephemeris while ( !$stat{'done'} ) { # Check for new ephemeris notify( 'data', 'eph', retrieve_data() ); select undef, undef, undef, $cnf{'period'}; } } if ( my $thread = threads->new( \&do_miner ) ) { lock( %{ $stat{'data'} } ); $stat{'data'}{'eph'} = &share( {} ); $thread->detach; } } sub spawn_worker() { sub do_worker { my $t = threads->self()->tid(); # wait for the main thread finish initialization # fix me, yield can't achieve our purpose, why? # threads->yield(); sleep(1); my $c; LOOP: while ( !$stat{'work'}{$t}{'stat'} ne 'done' ) { { lock %{ $cnf{'lock'} }; $c = $listener->accept() or next LOOP; } notify( 'work', $t, 'busy' ); trc 'log', "new connect accept", $c->peerhost().":".$c->pe +erport(); # fix me, the thread blocks here, why? # if I lock %{ $cnf{'lock'} } untill close the $c, things +will be OK, # but then we can't gain advantage of the multi-threads no +r the pre-threads # design. if ( my $r = $c->get_request() ) { trc 'dbg', 'new request', $r->as_string; if ( $r->method() eq 'GET' ) { if ( $r->uri()->path() eq '/' ) { my %query = $r->uri()->query_form(); if ( exists $query{'cmd'} ) { my $data; for ( $query{'cmd'} ) { /eph/i and do { $data = $stat{'data'}{ +'eph'}{'data'}; last }; } if ( $data ) { my $res = HTTP::Response->new( 200 ); $res->push_header( 'Expires' => HTTP:: +Date->time2str( $stat{'data'}{'eph'}{'expi'} ) ); $res->push_header( 'Content-Type' => ' +application/ubx' ); $res->content( $data ); $c->send_response( $res ); trc 'log', 'response', 'ok', "send @{ + [ int( length $data ) ] } bytes"; } else { $c->send_error(); trc 'log', 'response', 'error'; } } else { foreach my $f ( glob "$cnf{'doc'}/index.*" + ) { trc 'dbg', 'send', $f; $c->send_file_response( $f ); last; } } } else { ( my $path = $r->uri()->path() ) =~ s|^/|$cnf{ +'doc'}/|; trc 'dbg', 'send', $path; $c->send_file_response( $path ); } } } $c->close(); notify( 'work', $t, 'idle' ); } notify( 'work', $t, 'gone' ); } if ( my $thread = threads->new( \&do_worker ) ) { my $t = $thread->tid; lock %{ $stat{'work'} }; $stat{'work'}{ $t } = &share( {} ); $stat{'work'}{ $t }{'stat'} = 'idle'; trc 'dbg', 'new worker created'; $thread->detach; } else { trc 'dbg', 'new worker created fail'; } } sub notify(@) { $queue_up->enqueue( join ' ', @_ ); }

Replies are listed 'Best First'.
Re: multi thread problem in an simple HTTP server
by BrowserUk (Patriarch) on Apr 09, 2009 at 11:08 UTC

    The first thing you need to fix is this:

    lock %{ $cnf{'lock'} };

    (Hint: enabling warnings would tell you this!)

    You never create a key in %cnf called 'lock'. (And when you do, it would have to contain a reference to a shared variable for that to work.)

    Beyond that, there are several pretty dubious pieces of code:

    1. You have subroutines nested inside subroutines.

      Which almost certainly doesn't do what you think it does, and just serves to confuse things.

    2. Updated in the light of Corion's post below. It seems that END() subs will be treated as END{} blocks...though the sub annotation seems to serve little purpose other than to confused. Making END-blocks compatible with ancient versions of Perl seems to serve little purpose in a program that uses threads.

      You appear to think that your sub END will get called automatically (as you never call it directly), but that is not the case.

      END{} blocks get called automatically, not subs called END().

    3. You have
      # fix me, not work in win32, why? #$SIG{'INT'} = $SIG{'TERM'} = sub { lock %stat; $stat{'done'}++ };

      What do you mean by "not work in win32"?

      If uncommented, that statement (under win32) does install signal handlers for 'INT' and 'TERM' that do get called. So, in what way "not work"?

    4. You also have
      # fix me, the thread blocks here, why? # if I lock %{ $cnf{'lock'} } untill close the $c, things +will be OK, # but then we can't gain advantage of the multi-threads no +r the pre-threads # design. if ( my $r = $c->get_request() ) {

      It is not at all clear what you mean by that?

      To get to that point, this thread has accepted a connection and is now about to fetch the http request that connecting client made. How could it block?

    Overall, your post poses no questions. The source code contains several 'fix me' annotations, but the details are terse and sketchy and seem to reflect your misunderstandings more than problem descriptions that can be addressed.

    To progress this, you're going to have to detail each of the problems explaining what you think should happen and what actually happens.

    You should also explain the format of the urls the clients will send, because expecting us to reverse engineer them from the code is pushing your luck.

    In general, dumping a big chunk of code with a few embedded "fix me"s, is not a great way to go about getting help.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      A small niggle. I guess that for compatibility reasons, there are special names for subroutines that make them get called implicitly:

      > perl -le "sub BEGIN { print shift }; sub END { print shift }" Hello +World

      outputs

      Hello World

      I've used this trick when writing a module for a version of Perl that didn't understand CHECK or INIT blocks (5.04_03 or something like that). I declared these blocks as subroutines and then had another subroutine to execute them if the Perl version was too low.

        Just goes to show that there is always something new to learn, I've never knowingly encountered this form before, but there it is documented. Albiet that it seems to be a non-feature as they are still not callable in the normal way. But, I sit corrected :)

        Five specially named code blocks are executed at the beginning and at the end of a running Perl program. These are the BEGIN, UNITCHECK, CHECK, INIT, and END blocks.

        These code blocks can be prefixed with sub to give the appearance of a subroutine (although this is not considered good style). One should note that these code blocks don't really exist as named subroutines (despite their appearance).


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.

      Thanks for you quick suggestion, and excuse me for my confusing comment. Actually it is the first time I asking for help on the website, I do miss a lot of things. I am learning HTTP server in Perl, following the instruction of the book <Network Programming with Perl, By Lincoln D. Stein, December 15, 2000>, and doing a lot of test, it is my first attempt to implement a server with adaptive pre-threading technology.

      I am working on Microsoft Windows 2000 with Activestate perl v5.8.6 built for MSWin32-x86-multi-thread, and send HTTP request on Firefox web browser with the standard 'GET' method.

      1. lock %{ $cnf{'lock'} } it is a typo, should be lock %{ $stat{'lock'} }
      2. about the nested subroutines: I understand that the subroutines live in the package's symbol table, so it is global not private in the surrounding block, I am doing this just for convenience, they need not be invoked out side of the block. Am I missing something?
      3. thank you for pointing our my misunderstand of the END{} block.
      4. about the signal handler: I intent to stop the server smoothly when press Ctrl+C, if it behaves as what I think, it should increase the $stat{'done'} and then the main loop should break, and END{} is called, exit. But actually it does not, the server does not response to the key sequence, neither when I do like this $SIG{'INT'} = $SIG{'TERM'} = sub { die };
      5. the biggest problem is in the code:
        LOOP: while ( !$stat{'work'}{$t}{'stat'} ne 'done' ) { { lock %{ $stat{'lock'} }; $c = $listener->accept() or next LOOP; } notify( 'work', $t, 'busy' ); trc 'log', "new connect accept", $c->peerhost().":".$c->pe +erport(); # fix me, the thread blocks here, why? # if I lock %{ $cnf{'lock'} } untill close the $c, things +will be OK, # but then we can't gain advantage of the multi-threads no +r the pre-threads # design. if ( my $r = $c->get_request() ) {
        when running, one thread get the 'lock' and accept a new connection, then 'unlock', and it is blocked at the last line. I had done following test,
        1. remove the block surrounding the lock action, like this
          # { lock %{ $stat{'lock'} }; $c = $listener->accept() or next LOOP; # }
          it is OK, except the other threads can't accept new connection during the current thread handling the request, because the 'lock' is not released until the handling is finished, the multi-thread server becomes a serial one.
        2. when I remove the get_request() block, send data once the new connection is accepted regardless of the request, it is OK now.
        3. It is really confusing to me.

      6. How does the thread context switch, one more question in
        sub do_worker { my $t = threads->self()->tid(); # wait for the main thread finish initialization # fix me, yield can't achieve our purpose, why? # threads->yield(); sleep(1);
        if I do not add the last line of code, the main thread do not spawn enough 'worker' threads at the first time, only 2 ones created, but it should be 8 at this time. I guess the current thread is switched to the main thread at that time so I add threads->yield(), then one more 'worker' threads is spawned at the first time, but it is still not what I want. At last I add sleep(1), it is OK then, why?
      Looking forward to you kindly help.

        Okay. Let's get the easy ones out the way first.

        1. Typo. Fair enough, we all make those--especially me. But ...

          With the typo, you aren't locking anything, so the conclusions you have drawn on the basis that you thought that you were doing locking, are now invalid.

        2. Nested subroutines. "for convenience"

          In Perl, subroutines declared within a nested scope are:

          • compiled when the file is loaded;
          • visible outside the subroutine in which they are declared;
          • callable immediately they have been compiled (from anywhere);

          In other words, nesting them serves no purpose other than confusion. It creates expectations in those that do not know the above that their scope is limited. It creates confusion that observers (eg. me), think that the author of the code (eg. you), may believe that their scope is limited.

          The bottom line is that I don't see any convenience; just a source of confusion.

        3. END{} blocks.

          Okay. I learnt something new also :)

        4. Signal handler:

          The problem here is this:

          while ( !$stat{'done'} ) { if ( my $msg = $queue_up->dequeue() ) {

          That loop spends most of its time on the second of those two lines. Waiting to dequeue() a message from the queue. But dequeue() is a blocking call, so it won't get back to the top of the loop to check $stat{done} until a message is received.

          The solution is to use dequeue_nb() (nb for no block). That will ensure that the terminating condition is checked frequently. However, you will also want to add a short sleep to prevent the loop from thrashing the cpu.

          while ( !$stat{'done'} ) { if ( my $msg = $queue_up->dequeue_nb() ) { ... } else { sleep 1; ## Use usleep or select for better responsiveness } }
        5. the biggest problem is in the code:

          I'll come back to this in a separate reply.

        6. Also moved to a separate reply as there is a lot to deal with.

        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
        "if I do not add the last line of code, the main thread do not spawn enough 'worker' threads at the first time, only 2 ones created, but it should be 8 at this time. I guess the current thread is switched to the main thread at that time so I add threads->yield(), then one more 'worker' threads is spawned at the first time, but it is still not what I want. At last I add sleep(1), it is OK then, why?"

        You are misunderstanding what is going on here. Without the sleep, all the threads do get started--it just takes a minute or so for it to happen.

        The reason for the delay comes down to the locking you are using, but I'll come back to that. There is a far more fundamental issue to deal with first. That of how you are creating your worker threads:

        You are spawning each thread, using the thread handle return to obtain the thread ID, and using that to set up the control structures used by the thread:

        if ( my $thread = threads->new( \&do_worker ) ) { my $t = $thread->tid; lock %{ $stat{'work'} }; $stat{'work'}{ $t } = &share( {} ); $stat{'work'}{ $t }{'stat'} = 'idle';

        But, the first thing your thread does is attempt to use the control structure:

        sub do_worker { my $t = threads->self()->tid(); # sleep(1); my $c; LOOP: while ( !$stat{'work'}{$t}{'stat'} ne 'done' ) {

        Which your main thread hasn't yet created!

        And, you are also accessing that shared control structure without having locked it!

        To correct these problems, you could move the per-thread control structure creation inside the thread itself, and apply locking to the accesses:

        sub do_worker { my $t = threads->self()->tid(); { lock %{ $stat{'work'} }; $stat{'work'}{ $t } = &share( {} ); $stat{'work'}{ $t }{'stat'} = 'idle'; } my $c; LOOP: while ( do{ lock %{ $stat{'work'}{$t} }; $stat{'work'}{$t}{'s +tat'} } ne 'done' ) { ... if ( my $thread = threads->new( \&do_worker ) ) { my $t = $thread->tid; trc 'dbg', 'new worker created'; $thread->detach; ...

        But that raises another question: As these are 'per-thread' control structures, why are you having to lock them?

        Looking at the places outside of the worker thread where they are accessed, it all happens in you main thread loop. But your main thread loop is controlled by a message queue, and the messages are being sent from the worker threads.

        So the question then becomes: why send a message from a worker thread to the main thread to have it modify the shared, per-thread control structures when the worker threads could make those modifications directly?

        Indeed, the more I think about that logic, the more questionable the reasons for having a queue at all. There are other questions about your architecture that need answering, but I'll get back to them later.


        Now back to the question of why worker threads start up so slowly unless you add the sleep 1;.

        The problem arises here:

        LOOP: while ( $stat{'work'}{$t}{'stat'} ne 'done' ) { { lock %{ $stat{'lock'} }; $c = $listener->accept() or next LOOP; }

        That lock ($stat{'lock'}) is a global lock(*). And accept() is a blocking call. So, once one thread has entered that lock and is waiting for a connection, no other thread will be able to progress past that point until the first one has either: received a connection; or the accept() times out.

        Basically, the workers start slowly because you have programmed them to do so.

        You can demonstrate this to yourself by adjusting the timeout

        timeout => 10,</code>. <P>If you remove the <c>sleep 1;
        and set the timeout to 1, then you'll see that the 8 workers start much more quickly. One per second (as the accept() times out) instead of one every 10 seconds.

        Other problems:

        (*)Why are you using a shared hash for locking? You never put anything inside that hash, so a shared scalar would do the job just as well. And probably with less overheads.

        Your whole locking strategy needs review. For example, you are using code like this:

        lock %{ $stat{'work'} }; ... $stat{'work'}{$t}{'stat'} = $s; $stat{'work'}{$t}{'time'} = time(); $stat{'work'}{$t}{'work'}++ if $s eq 'idle';

        You are applying a global lock, when you only need to access per-thread data.That means all threads will be blocked for the duration of that lock which will have a detrimental affect on the overall performance.

        You could avoid that, but only applying a lock to the per-thread that you wish to modify:

        lock %{ $stat{'work'}{$t} }; ... $stat{'work'}{$t}{'stat'} = $s; $stat{'work'}{$t}{'time'} = time(); $stat{'work'}{$t}{'work'}++ if $s eq 'idle';

        which would leave all the other thread free to run.

        However, there is still the question of whether this per-thread data needs to be shared in the first place.

        Enough for this post. More to follow.


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
        the biggest problem is in the code: ...when running, one thread get the 'lock' and accept a new connection, then 'unlock', and it is blocked at the last line. I had done following test,

        This will be affected by the changes required to address the other problems I've already tackled, but there is a particularly important point to make. Your basic architecture has one big flaw at its core:

        You are calling $listener->accept() in all your threads.

        The normal multi-processing (be it threading or forking), server architecture has a main loop and a single point where incoming connects are accepted. It then either: spawns a thread or process to handle the newly connected client; or passes the client handle to a pre-existing (pre-forked) thread or process to deal with.

        With your architecture as it stands, although each worker will be using a CLONED copy of the listener socket at the perl-level, underlying the perl-level structures and data, at some point within the C runtime, the OS, or the tcpip stack, they are all trying to use and control a single socket

        And whilst you are applying Perl-level locking on the resource which should prevent any Perl-level sharing problems, it is not at all clear to me what the effect of calling accept() on that single socket from multiple threads will be. Basically, I've never seen it done that way in either Perl or C.

        Maybe it is fine if you only enter into the accept state on one thread concurrently (per your locking). But maybe not. And it is quite likely to be affected by the C runtime and/or OS you are running.

        Maybe it is a clever way of dodging the 'socket passing' problem, but I'd have to either see a (fully working) example of the technique in use, or code up a (greatly simplified) example of my own to convince myself that it works correctly under load.


        Bottom line: there are a lot of basic errors in your code and some questionable architecture. It's not easy to suggest how to fix it up completely, as how you would correct the basic problems, (and whether they would be still be needed), really depends on how you decide to address the architectural issues.

        For example, if you take my advise about modifying the per-thread control structures within the threads themselves, rather than passing messages to the main thread asking it to do it, then the reason for having the queue pretty much disappears. All that would leave your main thread to do, is adjusting the number of workers in the pool. But even the way you are doing that is questionable.

        The normal method with a pre-forking server is to start a new thread if there is no idle thread available to field a new connection. I see what you are aiming for with your low-water mark mechanism--always having at least two idle threads available--but I haven't managed to push your code sufficiently hard to create the situation where that comes into play. And I am dubious abut he way you have it coded.

        Each connection is so brief that its pretty much impossible to test multiple concurrent connections, when creating them manually via browser. It would require some kind of automated client set up to drive it hard enough to test that scenario. But given all the other problems that need fixing, plus the open architectural questions, it is not worth the effort of testing that as things stand.

        So, the balls in your court to decide how you are going to proceed from here?

        I may get time to throw something together to explore the multiple-listeners question sometime, but the rest will have to wait until you decide how you are going to proceed.


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
Re: multi thread problem in an simple HTTP server
by CountZero (Bishop) on Apr 09, 2009 at 10:18 UTC
    Just saying
    # fix me, not work in win32, why?
    is unlikely to get you good answers.

    What do you mean by "not work in win32". Does it gives out any error messages? What did you expect this code to achieve and what actually happened? ...

    CountZero

    A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

Re: multi thread problem in an simple HTTP server
by BrowserUk (Patriarch) on Apr 11, 2009 at 01:52 UTC

    I corrected it early on and then forgot about it.

    This is what you would have seen had you had warnings enabled:

    Argument "HTTP::Date" isn't numeric in gmtime at C:/Perl64/lib/HTTP/Da +te.pm line 26

    The line in your code that causes this error is:

    $res->push_header( 'Expires' => HTTP::Date->time2str( $stat{'data'}{'e +ph'}{'expi'} ) );

    You are calling time2str as class method when it is expecting to be called as a function:

    $res->push_header( 'Expires' => HTTP::Date::time2str( $stat{'data'}{'e +ph'}{'expi'} ) );

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      You taught me an important lesson: the '-w' should be switched on at the developing stage. I used to force myself to add 'use warnings', but can't remember from what time I gave up such a good habit. You saved me from out-of-control.

      At last I decide to switch to Cywin environment, because it can work perfectly there. I will continue to fix bug, apply suggestion, and add new features. Here is the updated version, any suggestion will be appriciated, thanks in advance.

      #!/usr/bin/perl # created by joe.zheng at Wed Apr 1 10:23:14 2009 # tested in Cygwin, perl v5.8.7 built for cygwin-thread-multi-64int use strict; use warnings; use 5.008; use threads; # multi-threads needed use threads::shared; # need to share some variables use Thread::Queue; # thread safe communication needed use Getopt::Long; use HTTP::Daemon; use SDBM_File; use Time::HiRes qw( time ); use List::Util qw( min ); # subroutine proto type sub trc (@); # global variables my $queue = Thread::Queue->new; my %self; # configuration my %cnf = ( # root server settings srv => 'agps.u-blox.com:46434', # server URL cmd => 'eph', # command usr => 'usr', # username pwd => 'pwd', # password # default position, being used for data from the root server. lat => 31, # degrees lon => 121, # degrees acc => 1000, # meters # our server settings port => 46434, # listen port queue => 5, # listen queue timeout => 10, # socket timeout # the extent server redirect to when the main server # can't serve the request, such as reach its capacity, # the data requested is not available, etc. ext_srv => '', workers_ini => 4, # prethread count workers_max => 20, # max worker thread allowed workers_idle_lo => 2, # low water mark workers_idle_hi => 8, # high water mark period => 600, # period to check new data # directory doc => 'doc', # the root directory where all the docs live in # misc dbg => 1, # print log for debug purpose ); # parse command line parameter GetOptions( \%cnf, 'usr=s', 'pwd=s', 'lat=f', 'lon=f', 'acc=i', 'srv=s', 'cmd=s', 'period=f', 'port=i', 'queue=i', 'timeout=f', 'ext_srv=s', 'doc=s', 'dbg=i', ); # ----------------------- # main routine # ----------------------- # set signal handler $SIG{'INT'} = $SIG{'TERM'} = sub { lock $self{'stat'}; $self{'stat'} = 'done'; }; init_srv(); spawn_worker() foreach ( 1 .. $cnf{'workers_ini'} ); spawn_miner(); trc 'dbg', 'start main loop'; MAIN_LOOP: while ( $self{'stat'} ne 'done' ) { if ( my $msg = $queue->dequeue() ) { trc 'dbg', 'new msg'; my ( $class, $payload ) = split /\s+/, $msg, 2; if ( $class eq 'worker' ) { my ( $t, $s ) = split /\s+/, $payload, 2; lock %{ $self{'worker'}{$t} }; trc 'dbg', 'worker stat', $t, $s; if ( $s eq 'gone' ) { delete $self{'worker'}{$t}; } else { $self{'worker'}{$t}{'stat'} = $s; $self{'worker'}{$t}{'time'} = time(); $self{'worker'}{$t}{'work'}++ if $s eq 'idle'; } } elsif ( $class eq 'miner' ) { my ( $t, $d ) = split /\s+/, $payload, 2; if ($d) { lock %{ $self{'miner'}{$t} }; trc 'dbg', 'miner data', $t, length $d; $self{'miner'}{$t}{'expi'} = time() + $cnf{'period'}; $self{'miner'}{$t}{'data'} = $d; } } elsif ( $class eq 'beat' ) { # heart beat event # now we can monitor the thread's activity trc 'dbg', 'heart beat', $payload; } else { # nothing to do } # shortcut # handle all the pending messages first next MAIN_LOOP if $queue->pending(); } # reap worker if necessary # adjust worker my @worker_idle = sort { $a <=> $b } grep { $self{'worker'}{$_}{'stat'} eq 'idle' } keys %{ $self{'worker'} }; my ( $worker_cnt_ttl, $worker_cnt_idl ) = ( scalar keys %{ $self{'worker'} }, scalar @worker_idle ); trc 'dbg', 'idle worker', $worker_cnt_idl, 'total worker', $worker_cnt_ttl; if ( $worker_cnt_idl < $cnf{'workers_idle_lo'} ) { my $cnt = min( $cnf{'workers_idle_lo'} - $worker_cnt_idl, $cnf{'workers_max'} - $worker_cnt_ttl, ); spawn_worker() foreach ( 1 .. $cnt ); } elsif ( $worker_cnt_idl > $cnf{'workers_idle_hi'} ) { my $cnt = $worker_cnt_idl - $cnf{'workers_idle_hi'}; kill_worker( $worker_idle[ $_ - 1 ] ) foreach ( 1 .. $cnt ); } } trc 'dbg', 'stop main loop'; END { trc 'dbg', 'start to terminate ...'; # close all the worker foreach ( keys %{ $self{'worker'} } ) { kill_worker($_); } # need to wait? trc 'dbg', 'terminated'; } # ----------------------- # subroutine definitions # ----------------------- sub retrieve_data { # dummy data here # fetch from root server return 'dummy data, comming soon ...'; } sub get_req_data { # dummy data here # fetch the request data from miner's repo return { data => 'dummy data, comming soon ...', expi => time() + $cnf{'period'}, }; } sub trc (@) { local $, = ', '; if ( $_[0] eq 'dbg' ) { # shortcut return if !$cnf{'dbg'}; print scalar(localtime), threads->self()->tid(), @_; } else { print scalar(localtime), @_; } print "\n"; } sub init_srv { # print out the configuration for debug purpose if ( $cnf{'dbg'} ) { print '<configuration>', "\n"; foreach my $k ( sort keys %cnf ) { print $k, ': ', $cnf{$k}, "\n"; } print '</configuration>', "\n"; } # main thread my $srv = HTTP::Daemon->new( LocalPort => $cnf{'port'}, Reuse => 1, Listen => $cnf{'queue'}, Timeout => $cnf{'timeout'}, ) or die "can't create local socket: $@\n"; trc 'log', "Accepting connections on", $srv->sockhost() . ':' . $srv->sockport(); { $self{'sock'} = $srv; $self{'stat'} = 'busy', share( $self{'stat'} ); $self{'lock'} = 'lock', share( $self{'lock'} ); $self{'miner'} = &share( {} ); $self{'worker'} = &share( {} ); } } sub do_miner { my $t = threads->self()->tid(); my @p = @_; trc 'dbg', 'do_miner'; { lock( %{ $self{'miner'} } ); $self{'miner'}{$t} = &share( {} ); } # periodically check for ephemeris while ( $self{'stat'} ne 'done' ) { # Check for new ephemeris notify( 'miner', $t, retrieve_data(@p) ); select undef, undef, undef, $cnf{'period'}; } trc 'dbg', 'miner gone'; } sub spawn_miner { if ( my $thread = threads->new( \&do_miner, @_ ) ) { trc 'dbg', 'new miner created'; $thread->detach; } else { trc 'dbg', 'new miner create failed'; # force to success, otherwise the service is not available die; } } sub do_worker { my $t = threads->self()->tid(); trc 'dbg', 'do_worker'; { lock %{ $self{'worker'} }; $self{'worker'}{$t} = &share( {} ); $self{'worker'}{$t}{'stat'} = 'idle'; } my $c; ACCEPT: while ( $self{'worker'}{$t}{'stat'} ne 'done' ) { { trc 'dbg', 'current stat', $self{'worker'}{$t}{'stat'}; notify( 'beat', $t, '' ); lock $self{'lock'}; next ACCEPT unless $c = $self{'sock'}->accept(); } notify( 'worker', $t, 'busy' ); trc 'log', "new connect accept", $c->peerhost() . ":" . $c->peerport(); if ( my $r = $c->get_request() ) { handle_req( $c, $r ); } $c->close(); notify( 'worker', $t, 'idle' ); } notify( 'worker', $t, 'gone' ); trc 'dbg', 'worker gone'; } sub spawn_worker { if ( my $thread = threads->new( \&do_worker ) ) { trc 'dbg', 'new worker created'; $thread->detach; } else { trc 'dbg', 'new worker create failed'; } } sub kill_worker { my ($t) = @_; trc 'dbg', 'kill worker', $t; lock %{ $self{'worker'}{$t} }; $self{'worker'}{$t}{'stat'} = 'done'; } sub notify { $queue->enqueue( join ' ', @_ ); } sub handle_req { my ( $c, $r ) = @_; trc 'dbg', 'new request', $r->as_string; if ( $r->method() eq 'GET' ) { if ( $r->uri()->path() eq '/' ) { my %query = $r->uri()->query_form(); if ( exists $query{'cmd'} ) { my $data; for ( $query{'cmd'} ) { /eph/i and do { $data = get_req_data(%query); last }; } if ($data) { my $res = HTTP::Response->new(200); my $exp = HTTP::Date::time2str( $data->{'expi'} ); $res->push_header( 'Expires' => $exp ); $res->push_header( 'Content-Type' => 'application/ +ubx' ); $res->content( $data->{'data'} ); $c->send_response($res); trc 'log', 'response', 'ok', 'data len', length $data->{'data'}; } else { $c->send_error(); trc 'log', 'response', 'error'; } } else { foreach my $f ( glob "$cnf{'doc'}/index.*" ) { trc 'dbg', 'send', $f; $c->send_file_response($f); last; } } } else { ( my $path = $r->uri()->path() ) =~ s|^/|$cnf{'doc'}/|; trc 'dbg', 'send', $path; $c->send_file_response($path); } } else { $c->send_error( 503, 'method' . $r->method() . 'not supported' + ); } }
        At last I decide to switch to Cywin environment, because it can work perfectly there.

        Hm. Why you think you need to "switch to Cygwin environment" is beyond me. You code as posted run perfectly fine as designed under both Vista 64 and XP-32. And I see nothing in your code that would require a POSIX layer. All Cygwin is going to do is make run even more sluggishly(*) than it already does.

        (*)Example 1: You've ignored my suggestion to use Thread::Queue::dequeue_nb() and substituted your 'beat' messages to try and ensure that your MAIN_LOOP wakes up in a timely fashion. But your beat message will only get sent each time one of your threads either accepts and in-bound connection; or its accept() times out! With your timeout set to 10 seconds, nothing will happen in your MAIN_LOOP for up to 10 seconds unless you are receiving connections. If you had substituted dequeue_nb() and a short sleep in the MAIN _LOOP, it would have remained as responsive as you chose--every 1 second or 1/10th second, regardless of your accept timeout or traffic load.

        But the real kicker is--as I identified earlier--your queue serves no purpose! Every action taken as a result of one of your notify() messages is to adjust values in thread specific shared variables. So, why send a message from your worker threads to your main thread to adjust those values, when those workers have direct access to those shared variables? All you are doing is delay those actions through several layers of unnecessary locking--those associated with the queue itself, and those that prevent your main thread from accessing the thread specific shared data. If each thread adjusted its own state variables--neither type of locking wold be required!

        (*)Example 2: In your do_miner(), you have this notify:

        while ( $self{'stat'} ne 'done' ) { # Check for new ephemeris notify( 'miner', $t, retrieve_data(@p) ); select undef, undef, undef, $cnf{'period'}; }

        which sends a message to your MAIN_LOOP that invokes this code:

        elsif ( $class eq 'miner' ) { my ( $t, $d ) = split /\s+/, $payload, 2; if ($d) { lock %{ $self{'miner'}{$t} }; trc 'dbg', 'miner data', $t, length $d; $self{'miner'}{$t}{'expi'} = time() + $cnf{'period'}; $self{'miner'}{$t}{'data'} = $d; } }

        But why have your miner thread send a message (via a queue with all the context switching and locking that involves), in order to set two shared variables (involving further locking), when that thread already has direct access to those two thread specific, shared variables?

        Both those lumps of code can be trivially reduced to:

        # periodically check for ephemeris while ( $self{'stat'} ne 'done' ) { # Check for new ephemeris $self{'miner'}{$t}{'expi'} = time() + $cnf{'period'}; $self{'miner'}{$t}{'data'} = retrieve_data(@p); select undef, undef, undef, $cnf{'period'}; }

        Removing two layers of locking and context switches. You may say that it only happens every 600 seconds, but why have a thread wake up every 600 seconds to send a message to the main loop asking it to break of fromdoing the other things it is charged with--when that thread can do those things itself?

        In addition, by having both your miner thread and all your worker threads take care of maintaining their own stated, the reason for the queue disappears entirely. That leaves your MAIN_LOOP just one job to do: monitor the worker thread states and ramp their numbers up or down as required. Something that currently only gets done if the queue is empty:

        # shortcut # handle all the pending messages first next MAIN_LOOP if $queue->pending(); }

        Which it never will be at the times when it is most needed--when you are getting lots of inbound accepts--because your workers are tying the main loop up with pointless requests to have the MAIN_LOOP maintain their state.

        Here is the updated version, any suggestion will be appriciated,

        As you've ignored just about all the suggestions I already made, there seems little point in repeating them.

        However, as I've tested my 'no notify - no queue' theories, and found that even with 80 clients pounding away flat out, it can handle them perfectly whilst remaining responsive to both new clients and keyboard (sig int) under Vista and XP, whereas your version starts timing clients out after 60 seconds with only 15 clients trying to access it, you might as well have the modified code:

        BTW: A far better use of a queue in your server would be to queue all your trc() messages to a single thread for output rather than slowing all your workers down waiting for access to a global lock. But I left that as an exercise for the reader.


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.