Re: Win32::MMF + threads misbehavior

I just knocked up a crude trace monitor based around a threaded, tcp server receiving trace information from DB::DB based, per thread clients.

Here's a sample of trace from 2 copies of a multi-threaded test app I had kicking around that are each running 100 threads:

# hires timestamp pid  tid  script       lineno  package         subro
+utine args
1144302961.37707  3168( 38) threadtest.pl(   19) Win32::Console::Write
+Char: n/a
1144302961.29688  3216( 41) threadtest.pl(   29) (eval): n/a
1144302961.29714  3216( 41) threadtest.pl(   29) (eval): n/a
1144302961.29730  3216( 41) threadtest.pl(   29) (eval): n/a
1144302961.29743  3216( 41) threadtest.pl(   29) (eval): n/a
1144302961.29756  3216( 41) threadtest.pl(   29) (eval): n/a
1144302961.29769  3216( 41) threadtest.pl(   16) Win32::Console::Write
+Char: n/a
1144302961.29782  3216( 41) threadtest.pl(   16) Win32::Console::Write
+Char: n/a
1144302961.29795  3216( 41) threadtest.pl(   16) Win32::Console::Write
+Char: n/a
1144302961.29818  3216( 41) threadtest.pl(   29) (eval): n/a
1144302961.35938  3216(  0) (     ) : n/a
1144302961.35980  3216(  0) (     ) : n/a
1144302961.40625  3216( 42) threadtest.pl(   29) (eval): n/a
1144302961.40652  3216( 42) threadtest.pl(   29) (eval): n/a
1144302961.40668  3216( 42) threadtest.pl(   29) (eval): n/a
1144302961.40681  3216( 42) threadtest.pl(   29) (eval): n/a
1144302961.40694  3216( 42) threadtest.pl(   29) (eval): n/a
1144302961.40706  3216( 42) threadtest.pl(   16) Win32::Console::Write
+Char: n/a
1144302961.40719  3216( 42) threadtest.pl(   16) Win32::Console::Write
+Char: n/a
1144302961.40732  3216( 42) threadtest.pl(   16) Win32::Console::Write
+Char: n/a
1144302961.40754  3216( 42) threadtest.pl(   29) (eval): n/a
1144302961.49839  3168( 38) threadtest.pl(   19) Win32::Console::Write
+Char: n/a
1144302961.49881  3168( 38) threadtest.pl(   19) Win32::Console::Write
+Char: n/a
1144302961.49897  3168( 38) threadtest.pl(   19) Win32::Console::Write
+Char: n/a
1144302961.50139  3168( 28) threadtest.pl(   19) Win32::Console::Write
+Char: n/a
1144302961.50156  3168( 28) threadtest.pl(   19) Win32::Console::Write
+Char: n/a
1144302961.50169  3168( 28) threadtest.pl(   19) Win32::Console::Write
+Char: n/a
1144302961.50000  3168( 31) threadtest.pl(   29) (eval): n/a
1144302961.50016  3168( 31) threadtest.pl(   29) (eval): n/a
1144302961.50028  3168( 31) threadtest.pl(   29) (eval): n/a
1144302961.50041  3168( 31) threadtest.pl(   19) Win32::Console::Write
+Char: n/a
1144302961.50054  3168( 31) threadtest.pl(   19) Win32::Console::Write
+Char: n/a
1144302961.50066  3168( 31) threadtest.pl(   19) Win32::Console::Write
+Char: n/a
1144302961.50000  3168(  7) threadtest.pl(   29) (eval): n/a
1144302961.50016  3168(  7) threadtest.pl(   29) (eval): n/a
1144302961.50028  3168(  7) threadtest.pl(   29) (eval): n/a
1144302961.50041  3168(  7) threadtest.pl(   19) Win32::Console::Write
+Char: n/a
1144302961.50054  3168(  7) threadtest.pl(   19) Win32::Console::Write
+Char: n/a
1144302961.50066  3168(  7) threadtest.pl(   19) Win32::Console::Write
+Char: n/a
1144302961.50000  3168( 15) threadtest.pl(   29) (eval): n/a
1144302961.50016  3168( 15) threadtest.pl(   29) (eval): n/a
1144302961.50028  3168( 15) threadtest.pl(   29) (eval): n/a
1144302961.50041  3168( 15) threadtest.pl(   19) Win32::Console::Write
+Char: n/a
1144302961.50053  3168( 15) threadtest.pl(   19) Win32::Console::Write
+Char: n/a
1144302961.50065  3168( 15) threadtest.pl(   19) Win32::Console::Write
+Char: n/a
1144302961.48975  3168(  6) threadtest.pl(   19) Win32::Console::Write
+Char: n/a
1144302961.48991  3168(  6) threadtest.pl(   19) Win32::Console::Write
+Char: n/a
1144302961.49124  3168( 14) threadtest.pl(   19) Win32::Console::Write
+Char: n/a
[download]

The 'n/a' is meant to be the subroutine args, but I haven't worked out how to obtain those yet.

You invoke the clients in the usual debugger fashion:

tperl -d:Ttrace threadtest.pl
[download]

This is the debug client module(crude):

package Devel::Ttrace;
use strict;
use warnings;
use Time::HiRes qw[ time ];
require IO::Socket::INET;
require threads;

my $socket = IO::Socket::INET->new( 'localhost:54321' );

sub DB::DB {
    no warnings 'uninitialized';
    my @caller = caller(1);
    printf $socket "%15.5f %5d(%3d) %s(%5s) %s: %s\n",
        time(),
        $$,
        threads->tid,
        @caller[ 1, 2, 3 ],
        'n/a'
    ;
}

1;
[download]

I ran the server, monitor.pl in another console session and just dumped the output from the cetral queue to the screen to produce the above output. You cold modify the Worker thread to

#!perl -slw
use strict;
use threads;
use Thread::Queue;
use Smart::Comments::Lite '0';

sub worker {
    require IO::Socket::INET;
    my $tid = threads->tid;
    our $running:shared;
    our $busy:shared;
    our $die:shared;
    my( $Qwork, $Qtrace ) = @_;

    ++$running;
    while( my $fno = $Qwork->dequeue() ) {
        ##1 warn "$tid($fno): Client waiting on $fno\n";
        my $client = IO::Socket::INET->new;
        $client->fdopen( $fno, '+>' )
            or die "$tid: Failed to reopen fileno: $fno";
        ##2 warn "$tid($fno): reopened $client\n";
        ++$busy;

        ##2 warn "$tid($fno): Reading from client\n";
        while( my $data = <$client> ) {
            chomp $data;
            ##2 warn "$tid($fno): Got '$data'\n";
            $Qtrace->enqueue( $data );
        }
        ##2 warn  "$tid($fno): Client disconnected\n";
        close $client;
        shutdown( $client, 2 );
        --$busy;
    }
    --$running;
}

our $START ||= 2;
our $MAX   ||= 2000;
our $running:shared = 0;
our $busy:shared    = 0;
our $die:shared     = 0;
our $connects:shared=0;

## Use ctrl-break to terminate the server.
local $SIG{INT} = sub{
    warn "SIGINT(2) terminating";
    $die = 1;
};

my $Qwork  = new Thread::Queue;
my $Qtrace = new Thread::Queue;
our %clients;

my @threads = map{
    threads->create( \&worker, $Qwork, $Qtrace )
        or warn( "Create thread $_ failed with $^E\n" ), ();
} 1 .. $START;
## This creates new threads if the current pool is running low
async{
    until( $die ) {
        if( ( $busy + 2 ) == $running and $running < $MAX ) {
            push @threads, threads->create( \&worker, $Qwork, $Qtrace 
+)
                or warn( "Create thread $_ failed with $^E\n" ), ();
        }
        sleep 1;
    }
};

## This reads and dumps the trace information to the screen.
async{
    until( $die ) {
        print $Qtrace->dequeue();
    }
}

require IO::Socket::INET;
my $server = IO::Socket::INET->new(
    LocalPort => 54321,
    Listen   => 1000,
    Reuse    => 1.
) or die $!, $^E;

## Currently, the client sockets are not cleaned up and will eventuall
+y run out of resource.
## Fixes welcome :)
my $fno;
while( not $die and ( my $client = $server->accept ) > 0 ) {
    ++$connects;
    $clients{ $client->fileno } = $client;
    $Qwork->enqueue( $client->fileno );
}
warn "\a*** Accept failed ***\a\n" and $die = 1;
close $server;

$Qwork->enqueue( undef ) for 1 .. $running;
sleep 1 while $running;
$_->join for @threads;
[download]

Smart::Comments::Lite is my doctored version of the theDamian's CPAN tool. Comment out the use line and it will disable it completely.

This was hacked together from bits of existing code in about one hour. It imposes very little load on the programs being traced and produced sequenced information that ought to make working out where your application is disappearing up it own navel fairly easy. Redirecting the output to a file, (via tee might be good), would allow you to get a permanent record of the sequence and timing of events that lead up to the problem.

HTH.

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.

Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?

"Science is about questioning the status quo. Questioning authority".

In the absence of evidence, opinion is indistinguishable from prejudice.

Comment on Re: Win32::MMF + threads misbehavior Select or Download Code

Replies are listed 'Best First'.
Re^2: Win32::MMF + threads misbehavior by renodino (Curate) on Apr 06, 2006 at 06:46 UTC
Not certain how you conclude "imposes very little overhead", but a socket write (local or not) + a printf for every Perl statement looks like a lot of overhead to me. Esp. if the problem thread is in a tight loop (altho the socket write will block eventually, so I guess that mitigates that issue) While I appreciate your efforts, I'm already concerned about the overhead of pack()ing an integer and a float prior to writing thru the tie() for every statement, so socket writes are not an acceptable solution for my purposes.	[reply]
Re^3: Win32::MMF + threads misbehavior by BrowserUk (Patriarch) on Apr 06, 2006 at 07:17 UTC
I'm already concerned about the overhead of pack()ing an integer and a float prior to writing thru the tie() for every statement, so socket writes are not an acceptable solution for my purposes. I hate to tell you this, but tieing isn't quick. In the case of MMF, around 2 to 3 times slower than writing to a socket. #! perl -slw use strict; use Win32::MMF::Shareable; use IO::Socket::INET; use Benchmark qw[ cmpthese ]; our $DSIZE \|\|= 100; my $mmf; tie $mmf, 'Win32::MMF::Shareable', 'mmf', { namespace => 'Win32MMFTest', size => 10000, reuse => 0 }; my $sock = IO::Socket::INET->new( 'localhost:54321' ); my $data = 'X' x $DSIZE; cmpthese -1, { MMF => sub { $mmf = $data }, TCP => sub { print $sock $data } }; __END__ C:\test>MMF-IO-b -DSIZE=8 Rate MMF TCP MMF 28872/s -- -77% TCP 123636/s 328% -- C:\test>MMF-IO-b -DSIZE=80 Rate MMF TCP MMF 27401/s -- -73% TCP 102909/s 276% -- C:\test>MMF-IO-b -DSIZE=800 Rate MMF TCP MMF 28295/s -- -67% TCP 86245/s 205% -- [download] And that could probably be speeded up by playing with the buffer sizes and upping the priority of the read threads. Not certain how you conclude "imposes very little overhead", but a socket write (local or not) + a printf for every Perl statement looks like a lot of overhead to me. Esp. if the problem thread is in a tight loop (altho the socket write will block eventually, so I guess that mitigates that issue) Not sure why you think it would block? The other ends of each of those sockets are being service by a dedicated thread that reads a line and posts it to a queue. With 2 processes running 100 threads, there appears to be negligable slowdown on the app under test and each of those ~~100~~ 200 threads is in a tight loop incrementing a variable and outputting it to the console. But, I can see I am wasting your time with a NIH solution you do not want, so I'll stop. Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal? "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice.	[reply] [d/l]
Re^4: Win32::MMF + threads misbehavior by renodino (Curate) on Apr 06, 2006 at 15:30 UTC
I know Win32::MMF has issues (mostly because its trying to be too clever, and locking everything in sight). But it was the tool at hand for Win32...however, given its issues, it may be a wash and I'll end up writing a simple variant that just does the mmap() equivalent, sans tie()'s and locks. And, having worked on both socket and memory mgmt internals of a few OS's, I know that sockets have more overhead than a pure mmap() solution. The issue is that Win32::MMF adds a lot of extra baggage; remove the baggage, and I'd expect a mmf to be significantly faster, and much lower overhead, than sockets and thread queues; after all, its just writing to its memory. "Not sure why you think it would block?" My monitor application doesn't run all the time, only when its needed to see whats happening. So eventually, sockets are going to run out of kernel buffers, and they're going to block. I'm going to ignore Win32 for now, get the Sys::Mmap version working, and then see what it would take to deconstruct Win32::MMF back down to a minimal mmap() implementation (and maybe do the same for Sys::Mmap if needed). Who knows, maybe I'll end up with a nicely OS agnostic mmap() package for Perl.	[reply]