in reply to Win32::MMF + threads misbehavior

I just knocked up a crude trace monitor based around a threaded, tcp server receiving trace information from DB::DB based, per thread clients.

Here's a sample of trace from 2 copies of a multi-threaded test app I had kicking around that are each running 100 threads:

# hires timestamp pid tid script lineno package subro +utine args 1144302961.37707 3168( 38) threadtest.pl( 19) Win32::Console::Write +Char: n/a 1144302961.29688 3216( 41) threadtest.pl( 29) (eval): n/a 1144302961.29714 3216( 41) threadtest.pl( 29) (eval): n/a 1144302961.29730 3216( 41) threadtest.pl( 29) (eval): n/a 1144302961.29743 3216( 41) threadtest.pl( 29) (eval): n/a 1144302961.29756 3216( 41) threadtest.pl( 29) (eval): n/a 1144302961.29769 3216( 41) threadtest.pl( 16) Win32::Console::Write +Char: n/a 1144302961.29782 3216( 41) threadtest.pl( 16) Win32::Console::Write +Char: n/a 1144302961.29795 3216( 41) threadtest.pl( 16) Win32::Console::Write +Char: n/a 1144302961.29818 3216( 41) threadtest.pl( 29) (eval): n/a 1144302961.35938 3216( 0) ( ) : n/a 1144302961.35980 3216( 0) ( ) : n/a 1144302961.40625 3216( 42) threadtest.pl( 29) (eval): n/a 1144302961.40652 3216( 42) threadtest.pl( 29) (eval): n/a 1144302961.40668 3216( 42) threadtest.pl( 29) (eval): n/a 1144302961.40681 3216( 42) threadtest.pl( 29) (eval): n/a 1144302961.40694 3216( 42) threadtest.pl( 29) (eval): n/a 1144302961.40706 3216( 42) threadtest.pl( 16) Win32::Console::Write +Char: n/a 1144302961.40719 3216( 42) threadtest.pl( 16) Win32::Console::Write +Char: n/a 1144302961.40732 3216( 42) threadtest.pl( 16) Win32::Console::Write +Char: n/a 1144302961.40754 3216( 42) threadtest.pl( 29) (eval): n/a 1144302961.49839 3168( 38) threadtest.pl( 19) Win32::Console::Write +Char: n/a 1144302961.49881 3168( 38) threadtest.pl( 19) Win32::Console::Write +Char: n/a 1144302961.49897 3168( 38) threadtest.pl( 19) Win32::Console::Write +Char: n/a 1144302961.50139 3168( 28) threadtest.pl( 19) Win32::Console::Write +Char: n/a 1144302961.50156 3168( 28) threadtest.pl( 19) Win32::Console::Write +Char: n/a 1144302961.50169 3168( 28) threadtest.pl( 19) Win32::Console::Write +Char: n/a 1144302961.50000 3168( 31) threadtest.pl( 29) (eval): n/a 1144302961.50016 3168( 31) threadtest.pl( 29) (eval): n/a 1144302961.50028 3168( 31) threadtest.pl( 29) (eval): n/a 1144302961.50041 3168( 31) threadtest.pl( 19) Win32::Console::Write +Char: n/a 1144302961.50054 3168( 31) threadtest.pl( 19) Win32::Console::Write +Char: n/a 1144302961.50066 3168( 31) threadtest.pl( 19) Win32::Console::Write +Char: n/a 1144302961.50000 3168( 7) threadtest.pl( 29) (eval): n/a 1144302961.50016 3168( 7) threadtest.pl( 29) (eval): n/a 1144302961.50028 3168( 7) threadtest.pl( 29) (eval): n/a 1144302961.50041 3168( 7) threadtest.pl( 19) Win32::Console::Write +Char: n/a 1144302961.50054 3168( 7) threadtest.pl( 19) Win32::Console::Write +Char: n/a 1144302961.50066 3168( 7) threadtest.pl( 19) Win32::Console::Write +Char: n/a 1144302961.50000 3168( 15) threadtest.pl( 29) (eval): n/a 1144302961.50016 3168( 15) threadtest.pl( 29) (eval): n/a 1144302961.50028 3168( 15) threadtest.pl( 29) (eval): n/a 1144302961.50041 3168( 15) threadtest.pl( 19) Win32::Console::Write +Char: n/a 1144302961.50053 3168( 15) threadtest.pl( 19) Win32::Console::Write +Char: n/a 1144302961.50065 3168( 15) threadtest.pl( 19) Win32::Console::Write +Char: n/a 1144302961.48975 3168( 6) threadtest.pl( 19) Win32::Console::Write +Char: n/a 1144302961.48991 3168( 6) threadtest.pl( 19) Win32::Console::Write +Char: n/a 1144302961.49124 3168( 14) threadtest.pl( 19) Win32::Console::Write +Char: n/a

The 'n/a' is meant to be the subroutine args, but I haven't worked out how to obtain those yet.

You invoke the clients in the usual debugger fashion:

tperl -d:Ttrace threadtest.pl

This is the debug client module(crude):

package Devel::Ttrace; use strict; use warnings; use Time::HiRes qw[ time ]; require IO::Socket::INET; require threads; my $socket = IO::Socket::INET->new( 'localhost:54321' ); sub DB::DB { no warnings 'uninitialized'; my @caller = caller(1); printf $socket "%15.5f %5d(%3d) %s(%5s) %s: %s\n", time(), $$, threads->tid, @caller[ 1, 2, 3 ], 'n/a' ; } 1;

I ran the server, monitor.pl in another console session and just dumped the output from the cetral queue to the screen to produce the above output. You cold modify the Worker thread to

#!perl -slw use strict; use threads; use Thread::Queue; use Smart::Comments::Lite '0'; sub worker { require IO::Socket::INET; my $tid = threads->tid; our $running:shared; our $busy:shared; our $die:shared; my( $Qwork, $Qtrace ) = @_; ++$running; while( my $fno = $Qwork->dequeue() ) { ##1 warn "$tid($fno): Client waiting on $fno\n"; my $client = IO::Socket::INET->new; $client->fdopen( $fno, '+>' ) or die "$tid: Failed to reopen fileno: $fno"; ##2 warn "$tid($fno): reopened $client\n"; ++$busy; ##2 warn "$tid($fno): Reading from client\n"; while( my $data = <$client> ) { chomp $data; ##2 warn "$tid($fno): Got '$data'\n"; $Qtrace->enqueue( $data ); } ##2 warn "$tid($fno): Client disconnected\n"; close $client; shutdown( $client, 2 ); --$busy; } --$running; } our $START ||= 2; our $MAX ||= 2000; our $running:shared = 0; our $busy:shared = 0; our $die:shared = 0; our $connects:shared=0; ## Use ctrl-break to terminate the server. local $SIG{INT} = sub{ warn "SIGINT(2) terminating"; $die = 1; }; my $Qwork = new Thread::Queue; my $Qtrace = new Thread::Queue; our %clients; my @threads = map{ threads->create( \&worker, $Qwork, $Qtrace ) or warn( "Create thread $_ failed with $^E\n" ), (); } 1 .. $START; ## This creates new threads if the current pool is running low async{ until( $die ) { if( ( $busy + 2 ) == $running and $running < $MAX ) { push @threads, threads->create( \&worker, $Qwork, $Qtrace +) or warn( "Create thread $_ failed with $^E\n" ), (); } sleep 1; } }; ## This reads and dumps the trace information to the screen. async{ until( $die ) { print $Qtrace->dequeue(); } } require IO::Socket::INET; my $server = IO::Socket::INET->new( LocalPort => 54321, Listen => 1000, Reuse => 1. ) or die $!, $^E; ## Currently, the client sockets are not cleaned up and will eventuall +y run out of resource. ## Fixes welcome :) my $fno; while( not $die and ( my $client = $server->accept ) > 0 ) { ++$connects; $clients{ $client->fileno } = $client; $Qwork->enqueue( $client->fileno ); } warn "\a*** Accept failed ***\a\n" and $die = 1; close $server; $Qwork->enqueue( undef ) for 1 .. $running; sleep 1 while $running; $_->join for @threads;

Smart::Comments::Lite is my doctored version of the theDamian's CPAN tool. Comment out the use line and it will disable it completely.

This was hacked together from bits of existing code in about one hour. It imposes very little load on the programs being traced and produced sequenced information that ought to make working out where your application is disappearing up it own navel fairly easy. Redirecting the output to a file, (via tee might be good), would allow you to get a permanent record of the sequence and timing of events that lead up to the problem.

HTH.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Replies are listed 'Best First'.
Re^2: Win32::MMF + threads misbehavior
by renodino (Curate) on Apr 06, 2006 at 06:46 UTC
    Not certain how you conclude "imposes very little overhead", but a socket write (local or not) + a printf for every Perl statement looks like a lot of overhead to me. Esp. if the problem thread is in a tight loop (altho the socket write will block eventually, so I guess that mitigates that issue)

    While I appreciate your efforts, I'm already concerned about the overhead of pack()ing an integer and a float prior to writing thru the tie() for every statement, so socket writes are not an acceptable solution for my purposes.

      I'm already concerned about the overhead of pack()ing an integer and a float prior to writing thru the tie() for every statement, so socket writes are not an acceptable solution for my purposes.

      I hate to tell you this, but tieing isn't quick. In the case of MMF, around 2 to 3 times slower than writing to a socket.

      #! perl -slw use strict; use Win32::MMF::Shareable; use IO::Socket::INET; use Benchmark qw[ cmpthese ]; our $DSIZE ||= 100; my $mmf; tie $mmf, 'Win32::MMF::Shareable', 'mmf', { namespace => 'Win32MMFTest', size => 10000, reuse => 0 }; my $sock = IO::Socket::INET->new( 'localhost:54321' ); my $data = 'X' x $DSIZE; cmpthese -1, { MMF => sub { $mmf = $data }, TCP => sub { print $sock $data } }; __END__ C:\test>MMF-IO-b -DSIZE=8 Rate MMF TCP MMF 28872/s -- -77% TCP 123636/s 328% -- C:\test>MMF-IO-b -DSIZE=80 Rate MMF TCP MMF 27401/s -- -73% TCP 102909/s 276% -- C:\test>MMF-IO-b -DSIZE=800 Rate MMF TCP MMF 28295/s -- -67% TCP 86245/s 205% --

      And that could probably be speeded up by playing with the buffer sizes and upping the priority of the read threads.

      Not certain how you conclude "imposes very little overhead", but a socket write (local or not) + a printf for every Perl statement looks like a lot of overhead to me. Esp. if the problem thread is in a tight loop (altho the socket write will block eventually, so I guess that mitigates that issue)

      Not sure why you think it would block? The other ends of each of those sockets are being service by a dedicated thread that reads a line and posts it to a queue. With 2 processes running 100 threads, there appears to be negligable slowdown on the app under test and each of those 100 200 threads is in a tight loop incrementing a variable and outputting it to the console.

      But, I can see I am wasting your time with a NIH solution you do not want, so I'll stop.


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
        I know Win32::MMF has issues (mostly because its trying to be too clever, and locking everything in sight). But it was the tool at hand for Win32...however, given its issues, it may be a wash and I'll end up writing a simple variant that just does the mmap() equivalent, sans tie()'s and locks. And, having worked on both socket and memory mgmt internals of a few OS's, I know that sockets have more overhead than a pure mmap() solution. The issue is that Win32::MMF adds a lot of extra baggage; remove the baggage, and I'd expect a mmf to be significantly faster, and much lower overhead, than sockets and thread queues; after all, its just writing to its memory.

        "Not sure why you think it would block?"

        My monitor application doesn't run all the time, only when its needed to see whats happening. So eventually, sockets are going to run out of kernel buffers, and they're going to block.

        I'm going to ignore Win32 for now, get the Sys::Mmap version working, and then see what it would take to deconstruct Win32::MMF back down to a minimal mmap() implementation (and maybe do the same for Sys::Mmap if needed). Who knows, maybe I'll end up with a nicely OS agnostic mmap() package for Perl.