worker threads - one does all the work

jvuman has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: worker threads - one does all the work by BrowserUk (Patriarch) on Jun 08, 2017 at 22:01 UTC
regardless of the number of worker threads I create, one of them does most - about 70% - of the work. When the thread that processes the last trap finishes with it, it is still running (has a timeslice), so it immediately loops back and attempts to obtain the lock. Most of the time it will succeed because none of the other threads are running at that moment in time. The other threads will only get a look in if this thread is swapped out; and that will only happen if it takes longer than its timeslice to process the previous trap. I'm not overly familiar with *nix system priorities and scheduling, but the idea of using a file system lock, even if it is cached, as a distribution mechanism for network IO traffic seems a little like putting a lollipop lady on a motorway. "Scalable" isn't the word that comes to mind here. With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday' Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". The enemy of (IT) success is complexity. In the absence of evidence, opinion is indistinguishable from prejudice. Suck that fhit	[reply]
Re: worker threads - one does all the work by marioroy (Prior) on Jun 09, 2017 at 03:56 UTC
Update: Previously, I noted the time after enqueuing 10k traps into the queue. Unfortunately, I didn't factor the traps still pending in the queue and met to report the duration time after the queue is depleated. I've gone through and corrected all my posts in this thread. Hello jvuman and welcome to the amazing monastery. Thank you for introducing Net::SNMPTrapd which I've not used before. It is possible to run a powerful trap server using a single listener and many consumers. One might do so using threads and Thread::Queue or similarly with MCE::Flow and MCE::Queue. The latter is provided below. Here, I have each consumer sleep for 4 milliseconds to simulate work. Awaiting on the queue is a safety measure to prevent the queue from consuming gigabytes of memory in the event receiving millions of traps. Please adjust the threshold to your satisfaction. In my testing, the server process never entered the pending if statement. MCE and MCE::Shared (not used here) involve IPC behind the scene. Fetching is faster from a BSD OS (e.g. FreeBSD, darwin) when compared to Linux. See this post for more info. `perl snmp_test.pl >/dev/null duration: 1.013 seconds` [download] That's seems fast considering that 2 producers share CPU time with the listener process and consumers. use strict; use warnings; use feature 'say'; use Time::HiRes qw( sleep time ); use Net::SNMPTrapd; use MCE::Flow; use MCE::Queue; my $queue = MCE::Queue->new( await => 1, fast => 1 ); my $start = time; # floating seconds # from left to right: 1 listener, 30 consumers, and 2 producers mce_flow { max_workers => [ 1, 30, 2 ] }, \&listener, \&consumer, \&pr +oducer; printf {STDERR} "duration: %0.3f seconds\n", time() - $start; exit(0); # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # listener and consumer roles # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ sub listener { my $snmptrapd = Net::SNMPTrapd->new( ReusePort => 1 ); my $count = 0; while ( 1 ) { my $trap = $snmptrapd->get_trap(); if ( !defined $trap ) { printf {STDERR} "$0: %s\n", Net::SNMPTrapd->error(); next; } elsif ( $trap == 0 ) { next; } # important, remove the file handle inside the object # so that serialization into the queue succeeds delete $trap->{_UDPSERVER_}; # enqueue the trap for a consumer to process $queue->enqueue($trap); # await will block if the queue has more than 2000 pending # this prevents the queue from consuming memory out of control # simply increase the number of consumers to not block # # $queue->await(2000) if ( ++$count % 4000 == 0 ); if ( ( ++$count % 4000 == 0 ) && $queue->pending() > 3000 ) { say {*STDERR} "$0: WARN: blocking temporarily"; $queue->await(2000); } # reset the counter to not overflow $count = 0 if ( $count > 2e9 ); # for benchmarking, leave the loop after 4000 traps last if ( $count == 4000 ); } # notify the manager process to stop the producer and queue MCE->do('quit_producers'); } sub consumer { while ( defined ( my $trap = $queue->dequeue() ) ) { $trap->process_trap(); # printf "[$$] %s\t%i\t%i\t%s\n", # $trap->remoteaddr, $trap->remoteport, # $trap->version, $trap->community; say "[$$] ".$trap->varbinds->[5]->{'1.3.6.1.4.1.50000.1.6'}; sleep 0.004; } } # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # producer role for generating traps # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ my @producer_pids; sub set_producer_pid { push @producer_pids, $_[0]; } sub quit_producers { kill 'QUIT', @producer_pids; $queue->end(); } sub producer { require Net::SNMP; my ( $run, $i ) = ( 1, 0 ); $SIG{QUIT} = sub { $run = 0 }; # notify the manager process my pid MCE->do('set_producer_pid', $$); my ( $session, $error ) = Net::SNMP->session( -hostname => 'localhost', -version => 2, -community => 'public', -port => 162 ); if ( !defined $session ) { printf "Error: Starting SNMP session (v2c trap) - %s\n", $erro +r; return; } while ( $run ) { my $result = $session->snmpv2_trap( -varbindlist => [ '1.3.6.1.2.1.1.3.0', 0x43, int( time() ), '1.3.6.1.6.3.1.1.4.1.0', 0x06, '1.3.6.1.4.1.50000', '1.3.6.1.4.1.50000.1.3', 0x02, 1, '1.3.6.1.4.1.50000.1.4', 0x04, 'String', '1.3.6.1.4.1.50000.1.5', 0x06, '1.2.3.4.5.6.7.8.9', '1.3.6.1.4.1.50000.1.6', 0x40, '10.10.10.'.((++$i % 1 +00) + 1), '1.3.6.1.4.1.50000.1.7', 0x41, 32323232, '1.3.6.1.4.1.50000.1.8', 0x42, 42424242, '1.3.6.1.4.1.50000.1.9', 0x43, int( time() ), '1.3.6.1.4.1.50000.1.10', 0x44, 'opaque data' ] ); } $session->close; } [download] In reality, the server process can handle more than 4,000 traps per second simply by running the server and consumers only. `mce_flow { max_workers => [ 1, 30 ] }, \&server, \&consumer;` [download] Regards, Mario.	[reply] [d/l] [select]
Re^2: worker threads - one does all the work by marioroy (Prior) on Jun 09, 2017 at 04:48 UTC
Update: I had a chance to run this on a Linux machine. IPC wise (fetching - dequeue) is slower from a Linux OS than from a BSD variant: e.g. FreeBSD, darwin. Running with threads also takes longer. See this post for more info. Below, I've updated the main script to show the enqueue time, pending count, and finally the duration time. A production environment might have a load balancer and 4 pizza-box (1-inch) servers. Together, the 4 servers can handle 1 million traps per minute if that's the scale needed. I reached 4.5k traps per second on my Linux machine. Update: Loading threads at the top of the script will have MCE spawn threads instead. Doing so may cause consumers to run slower for some reason on Linux. Maybe it's from running an older Perl 5.16.3 release. I'm not sure. For maximum performance check to see if Perl has Sereal installed. MCE defaults to Sereal::Encoder 3.015+ and Sereal::Decoder 3.015+ for serealization when available. `use threads;` [download] Update: I've updated the code to synchronize STDOUT and STDERR output to the manager process. Furthermore, specified user_output and user_error MCE options so that one can send to a logging routine if need be. Omitting them will have the manager process write directly to STDOUT and STDERR respectively. Hello again jvuman, Generating traps is handled by another script. Doing this allowed me to move the trap generator (producer) to another machine. My laptop running the server script can process 6k per second. use strict; use warnings; # perl snmp_server.pl \| wc -l use Time::HiRes qw( sleep time ); use Net::SNMPTrapd; use MCE::Flow; use MCE::Queue; my $queue = MCE::Queue->new( await => 1, fast => 1 ); my $max_consumers = 30; my $start = 0; MCE::Flow::init { user_output => sub { print {STDOUT} $_[0]; }, user_error => sub { print {STDERR} $_[0]; }, }; mce_flow { max_workers => [ 1, $max_consumers ] }, \&listener, \&consu +mer; printf {STDERR} "duration : %0.03f seconds\n", time - $start; exit(0); sub set_start { $start = $_[0]; } sub listener { my $snmptrapd = Net::SNMPTrapd->new( ReusePort => 1 ); my $count = 0; my $start; while ( 1 ) { my $trap = $snmptrapd->get_trap(); if ( !defined $trap ) { MCE->printf(\STDERR, "$0: %s\n", Net::SNMPTrapd->error()) +; next; } elsif ( $trap == 0 ) { next; } $start = time(), MCE->do('set_start', $start) unless $start; # important, remove the file handle inside the object delete $trap->{_UDPSERVER_}; # enqueue the trap for a consumer to process $queue->enqueue($trap); # leave the loop after 10,000 traps last if ( ++$count >= 10000 ); # safety to prevent the queue from consuming memory out of con +trol # $queue->await(2000) if ( $count % 4000 == 0 ); # if ( ( $count % 4000 == 0 ) && $queue->pending() > 3000 ) { # MCE->say(\STDERR, "Warn: blocking temporarily"); # $queue->await(2000); # } # reset the counter to not overflow $count = 0 if ( $count > 2e9 ); } $queue->enqueue((undef) x $max_consumers); MCE->printf(\STDERR, "enqueue : %0.03f seconds\n", time - $start +); MCE->printf(\*STDERR, "pending : %d\n", $queue->pending()); } sub consumer { while ( defined ( my $trap = $queue->dequeue() ) ) { $trap->process_trap(); MCE->printf( "[$$] %s\t%i\t%i\t%s\n", $trap->remoteaddr, $trap->remoteport, $trap->version, $trap->community ); sleep 0.004; } } [download] Here is the producer script for generating traps in parallel. This is useful for load testing. use strict; use warnings; # perl snmp_producer.pl use Net::SNMP; use MCE::Flow; use MCE::Queue; mce_flow { max_workers => 2 }, \&producer; exit(0); sub producer { my ( $session, $error ) = Net::SNMP->session( -hostname => '127.0.0.1', -version => 2, -community => 'public', -port => 162 ); if ( !defined $session ) { printf "Error: Starting SNMP session (v2c trap) - %s\n", $erro +r; return; } for my $i ( 1 .. 5000 ) { my $result = $session->snmpv2_trap( -varbindlist => [ '1.3.6.1.2.1.1.3.0', 0x43, int( time() ), '1.3.6.1.6.3.1.1.4.1.0', 0x06, '1.3.6.1.4.1.50000', '1.3.6.1.4.1.50000.1.3', 0x02, 1, '1.3.6.1.4.1.50000.1.4', 0x04, 'String', '1.3.6.1.4.1.50000.1.5', 0x06, '1.2.3.4.5.6.7.8.9', '1.3.6.1.4.1.50000.1.6', 0x40, '10.10.10.'.((++$i % 1 +00) + 1), '1.3.6.1.4.1.50000.1.7', 0x41, 32323232, '1.3.6.1.4.1.50000.1.8', 0x42, 42424242, '1.3.6.1.4.1.50000.1.9', 0x43, int( time() ), '1.3.6.1.4.1.50000.1.10', 0x44, 'opaque data' ] ); } $session->close; } [download] To benchmark, run snmp_server.pl on machine A. Then, run snmp_producer.pl on machine B. Remember to change the IP address to host A inside producer if running on another host. `Host A : perl snmp_server.pl \| wc -l Host B : perl snmp_producer.pl` [download] Regards, Mario.	[reply] [d/l] [select]
Re^3: worker threads - one does all the work by marioroy (Prior) on Jun 09, 2017 at 12:04 UTC
The following does similarly to what the OP described. I've experienced lost traps on Linux, causing the server script to never leave the loop. On the Mac, it takes 3.822 seconds to process 10,000 traps when successful. use strict; use warnings; use threads; use threads::shared; use Time::HiRes qw( sleep time ); use Net::SNMPTrapd; use MCE::Flow; use MCE::Queue; my $tid = 0; sub CLONE { $tid = threads->tid(); } my $snmptrapd = Net::SNMPTrapd->new( ReusePort => 1 ); my $max_workers = 30; my $count : shared = 0; my $start : shared; MCE::Flow::init { user_output => sub { print {STDOUT} $_[0]; }, user_error => sub { print {STDERR} $_[0]; }, }; mce_flow { max_workers => $max_workers }, \&server; printf {STDERR} "duration: %0.03f seconds\n", time - $start; exit(0); sub server { my $done = 0; while ( 1 ) { my $trap; { lock $count; $start = time unless $start; $trap = $snmptrapd->get_trap(); $done = 1 if ( ++$count > 10000 - $max_workers ); } if ( !defined $trap ) { MCE->printf(\STDERR, "$0: %s\n", Net::SNMPTrapd->error()) +; next; } elsif ( $trap == 0 ) { next; } $trap->process_trap(); MCE->printf( "[%02d] %s\t%i\t%i\t%s\n", $tid, $trap->remoteaddr, $trap->remoteport, $trap->version, $trap->community ); last if $done; sleep 0.004; } } [download] Regards, Mario.	[reply] [d/l]
Re^2: worker threads - one does all the work by zentara (Cardinal) on Jun 09, 2017 at 13:52 UTC
Wow, Mario, I'm so glad you are posting many workable templates for MCE. I'm saving them all. You have a Cookbook yet? I'm not really a human, but I play one on earth. ..... an animated JAPH	[reply]
Re^3: worker threads - one does all the work by marioroy (Prior) on Jun 10, 2017 at 06:07 UTC
Hi zentara. Started something on github but am not liking the format and stopped temporarily. At some point will renew the documentation.	[reply]
A reply falls below the community's threshold of quality. You may see it by logging in.
Re: worker threads - one does all the work by talexb (Chancellor) on Jun 09, 2017 at 15:07 UTC
This is a really interesting post, and reminds me of doing assembler programming using Interrupt Service Routines in the 70's and 80's. To handle the highest possible rate of interrupts, the ISR code needed to be as brief as possible -- it would wake up, grab the data that's just arrived, and stuff it into a circular buffer, for example. The data might be a single character, several characters, or even an entire message. Somewhere else, an idle loop would be watching the same circular buffer for activity, and as soon as something arrived, it would deal with it at normal priority. These two activities live in different worlds -- one doing as little as possible, as quickly as possible, and the other doing the needful, as data arrived. Ideally, this would avoid the situation where interrupts happen faster than they could be processed, resulting in dropped events, and therefore lost data. Applying this to your situation, I might have the parent handle the clunkier processing, and have the child handle the SNMP traps .. but if there are plenty of solutions already in CPAN, that would probably be a better way forward. Thanks again for the intriguing post. Alex / talexb / Toronto Thanks PJ. We owe you so much. Groklaw -- RIP -- 2003 to 2013.	[reply]
Re^2: worker threads - one does all the work by marioroy (Prior) on Jun 10, 2017 at 07:39 UTC
Hi talexb, Regarding the MCE module, the main process enters a loop to handle IPC events while running. The following is a MCE::Hobo and MCE::Shared demonstration, based on the MCE::Flow solution. Here, MCE::Shared spawns a background process to handle IPC events. This allows the main process to listen for traps. use strict; use warnings; # perl snmp_server.pl \| wc -l use Time::HiRes qw( sleep time ); use Net::SNMPTrapd; use MCE::Hobo; use MCE::Shared; my $queue = MCE::Shared->queue( await => 1, fast => 1 ); my $start = 0; # construct two shared handles, prevents garbled output # consumers send output to the shared-manager process mce_open my $outfh, '>>', \STDOUT; mce_open my $errfh, '>>', \STDERR; MCE::Hobo->create(\&consumer) for 1 .. 30; listener(); MCE::Hobo->waitall(); printf {$errfh} "duration : %0.03f seconds\n", time - $start; exit(0); sub listener { my $snmptrapd = Net::SNMPTrapd->new( ReusePort => 1 ); my $count = 0; while ( 1 ) { my $trap = $snmptrapd->get_trap(); if ( !defined $trap ) { printf {$errfh} "$0: %s\n", Net::SNMPTrapd->error(); next; } elsif ( $trap == 0 ) { next; } $start = time() unless $start; # important, remove the file handle inside the object delete $trap->{_UDPSERVER_}; # enqueue the trap for a consumer to process $queue->enqueue($trap); # leave the loop after 10,000 traps last if ( ++$count >= 10000 ); # reset the counter to not overflow $count = 0 if ( $count > 2e9 ); } $queue->end(); printf {$errfh} "enqueue : %0.03f seconds\n", time - $start; printf {$errfh} "pending : %d\n", $queue->pending(); } sub consumer { while ( defined ( my $trap = $queue->dequeue() ) ) { $trap->process_trap(); printf {$outfh} "[$$] %s\t%i\t%i\t%s\n", $trap->remoteaddr, $trap->remoteport, $trap->version, $trap->community; sleep 0.004; } } [download] For comparison, the following provides a threads and Thread::Queue demonstration. Fortunately, one may run MCE::Shared alongside threads to get shared-handles support. Please note that this demonstration requires freezing and thawing at the application level. Serialization is typically automatic for MCE and MCE::Shared solutions. use strict; use warnings; # perl snmp_server.pl \| wc -l use threads; use Thread::Queue; use Time::HiRes qw( sleep time ); use Storable qw( freeze thaw ); use Net::SNMPTrapd; use MCE::Shared; my $queue = Thread::Queue->new(); my $max_consumers = 30; my $start = 0; mce_open my $outfh, '>>', \STDOUT; mce_open my $errfh, '>>', \STDERR; threads->create(\&consumer) for 1 .. $max_consumers; listener(); $_->join for threads->list(); printf {$errfh} "duration : %0.03f seconds\n", time - $start; exit(0); sub listener { my $snmptrapd = Net::SNMPTrapd->new( ReusePort => 1 ); my $count = 0; while ( 1 ) { my $trap = $snmptrapd->get_trap(); if ( !defined $trap ) { printf {$errfh} "$0: %s\n", Net::SNMPTrapd->error(); next; } elsif ( $trap == 0 ) { next; } $start = time() unless $start; # important, remove the file handle inside the object delete $trap->{_UDPSERVER_}; # enqueue the trap for a consumer to process $queue->enqueue( freeze($trap) ); # leave the loop after 10,000 traps last if ( ++$count >= 10000 ); # reset the counter to not overflow $count = 0 if ( $count > 2e9 ); } # $queue->end(); # newer Thread::Queue and MCE::Shared releases $queue->enqueue((undef) x $max_consumers); # older releases printf {$errfh} "enqueue : %0.03f seconds\n", time - $start; printf {$errfh} "pending : %d\n", $queue->pending(); } sub consumer { while ( defined ( my $item = $queue->dequeue() ) ) { my $trap = thaw($item); $trap->process_trap(); printf {$outfh} "[$$] %s\t%i\t%i\t%s\n", $trap->remoteaddr, $trap->remoteport, $trap->version, $trap->community; sleep 0.004; } } [download] On my Linux box, the MCE::Hobo and MCE::Shared demonstration completes 10k traps in 2.2 seconds. The threads and Thread::Queue demonstration needs more time, unexpectingly and completes in 13.4 seconds. Notice the difference with the number of traps pending in the queue. `# perl snmp_hobo.pl \| wc -l enqueue : 1.982 seconds pending : 1085 duration : 2.229 seconds 10000 # perl snmp_thr.pl \| wc -l enqueue : 1.987 seconds pending : 8510 duration : 13.372 seconds 10000` [download] The following is the trap generator used to feed both demonstrations. To not impact the listener/consumer script, run this from another host. use strict; use warnings; use Net::SNMP; use MCE::Flow; use MCE::Queue; mce_flow { max_workers => 2 }, \&producer; exit(0); sub producer { my ( $session, $error ) = Net::SNMP->session( -hostname => '192.168.0.16', -version => 2, -community => 'public', -port => 162 ); if ( !defined $session ) { printf "Error: Starting SNMP session (v2c trap) - %s\n", $erro +r; return; } for my $i ( 1 .. 5000 ) { my $result = $session->snmpv2_trap( -varbindlist => [ '1.3.6.1.2.1.1.3.0', 0x43, int( time() ), '1.3.6.1.6.3.1.1.4.1.0', 0x06, '1.3.6.1.4.1.50000', '1.3.6.1.4.1.50000.1.3', 0x02, 1, '1.3.6.1.4.1.50000.1.4', 0x04, 'String', '1.3.6.1.4.1.50000.1.5', 0x06, '1.2.3.4.5.6.7.8.9', '1.3.6.1.4.1.50000.1.6', 0x40, '10.10.10.'.((++$i % 1 +00) + 1), '1.3.6.1.4.1.50000.1.7', 0x41, 32323232, '1.3.6.1.4.1.50000.1.8', 0x42, 42424242, '1.3.6.1.4.1.50000.1.9', 0x43, int( time() ), '1.3.6.1.4.1.50000.1.10', 0x44, 'opaque data' ] ); } $session->close; } [download] Regarding IPC fetch-requests, I've tried for MCE and MCE::Shared on Linux to reach closer to BSD levels. Threads is a mystery sometimes. I'm not sure why threads is running slow under Red Hat / CentOS 7.3 - Perl v5.16.3. `# Linux: MCE::Hobo - MCE::Shared->queue, Perl v5.16.3 enqueue : 1.982 seconds pending : 1085 duration : 2.229 seconds 10000 # Linux: threads - Thread::Queue, Perl v5.16.3 enqueue : 1.987 seconds pending : 8510 duration : 13.372 seconds 10000 # Mac OS X: MCE::Hobo - MCE::Shared->queue, Perl v5.18.2 enqueue : 1.589 seconds pending : 191 duration : 1.632 seconds 10000 # Mac OS X: threads - Thread::Queue, Perl v5.18.2 enqueue : 1.620 seconds pending : 128 duration : 1.867 seconds 10000` [download] Regards, Mario	[reply] [d/l] [select]
A reply falls below the community's threshold of quality. You may see it by logging in.
Re: worker threads - one does all the work by locked_user sundialsvc4 (Abbot) on Jun 08, 2017 at 20:39 UTC
I am skeptical that a multi-worker approach would in fact be beneficial if all of the data is coming in strictly from one TCP/IP port. I am skeptical that the additional overhead of the approach that you suggest here might in fact just slow it down. If the work that needs to be done upon receipt of any trap is “non-trivial,” then you might have one “listener” thread that does nothing more than toss the request into a queue for consumption by a second thread or pool of threads, leaving the listener free to process the traps as quickly as they arrive without waiting for any of them to be processed. The latency of the system will be very consistent and very low, even under load. Furthermore, there are several existing frameworks for building “all the necessary plumbing” � including the venerable `POE` and a variety of thread-safe queues. Everything you might need to set up worker-pools, queues, and to manage the whole thing are already available in CPAN so that you will not start from scratch.