PerlMonks  

Re^4: Interrupt multi-process program while using MCE::Shared hash: END block code does not (all) run

by Anonymous Monk
on Apr 07, 2017 at 20:38 UTC ( [id://1187427] )


in reply to Re^3: Interrupt multi-process program while using MCE::Shared hash: END block code does not (all) run
in thread Interrupt multi-process program while using MCE::Shared hash: END block code does not (all) run

Thanks, 1nickt. Regarding the END block: no problem, and why not use it. MCE::Signal is the reason the END block is not called for the parent process; it ends up sending a KILL signal. Fortunately, one can disable that by loading MCE::Signal before the other MCE modules and passing the -no_kill9 option. Now that the END block is working, the signal handlers for the parent process are no longer needed, nor is the TERM handler for MCE workers. The script now looks like this.

Regarding shared objects, having OO and auto-dereferencing on the fly makes it so natural versus calling tied(%hash)->method.

    use strict; use warnings; use feature 'say';
    use Data::Dumper; ++$Data::Dumper::Sortkeys;

    use MCE::Signal qw( -no_kill9 );
    use MCE::Loop;
    use MCE::Shared;

    $|++;

    my $pid = $$;
    say "Parent PID $pid";

    my $hash = MCE::Shared->hash();

    MCE::Loop->init(
        max_workers => 2,
        chunk_size  => 1,
        user_begin  => sub {
            $SIG{'INT'} = sub {
                my $signal = shift;
                say "Hello from $signal: $$";
                MCE->exit(0);
            };
        }
    );

    mce_loop {
        my ( $mce, $chunk_ref, $chunk_id ) = @_;

        say sprintf 'worker %s (%s) processing chunk %s', MCE->wid, MCE->pid, $chunk_id;

        for ( @{ $chunk_ref } ) {
            $hash->{ sprintf '%.2d %s', $_, $$ } = time;
            sleep 2;
        }
    } ( 0 .. 6 );

    MCE::Loop->finish;

    END {
        say "Hello from END block: $$";
        if ( $$ == $pid ) {
            say 'Parent in END: ' . Dumper $hash->export;
        }
    }
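For anyone wondering why a KILL signal skips the END block, here is a minimal core-Perl sketch that shows the effect in isolation. It does not use MCE at all; the helper name end_block_ran is made up for illustration, and it assumes a Unix-like system where list-form pipe open is available. A child perl sends itself either a catchable TERM (whose handler calls exit) or an uncatchable KILL, and the parent checks whether the child's END block printed anything.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of MCE): spawn a child perl that sends the
# given signal to itself, and report whether its END block produced output.
sub end_block_ran {
    my ($signal) = @_;

    # Code run by the child. TERM is caught and turned into a normal exit,
    # which lets END blocks run. KILL cannot be caught at all.
    my $code = q{
        $SIG{TERM} = sub { exit 1 };
        END { print "END ran\n" }
        kill $ARGV[0], $$;
        sleep 5;
    };

    open my $fh, '-|', $^X, '-e', $code, $signal
        or die "cannot spawn perl: $!";
    my $out = do { local $/; <$fh> } // '';
    close $fh;    # a non-zero child status is expected here, so it is ignored

    return $out =~ /END ran/ ? 1 : 0;
}

print "TERM: END block ran? ", end_block_ran('TERM'), "\n";
print "KILL: END block ran? ", end_block_ran('KILL'), "\n";
```

With a handled TERM the process exits through the normal path, so END runs; with KILL the process is gone before Perl gets a chance to run anything, which is the situation the -no_kill9 option avoids.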

I will test the change to MCE::Shared::Server. I'm not sure whether it is feasible to handle both situations. However, it seems important for the shared-server to stick around longer so that it can handle requests made inside signal handlers and END blocks.

    sub _loop {
       $_is_client = 0;

     # $SIG{HUP} = $SIG{INT} = $SIG{QUIT} = $SIG{TERM} = sub {
     #    $SIG{INT} = $SIG{$_[0]} = sub { };
     #
     #    CORE::kill($_[0], $_is_MSWin32 ? -$$ : -getpgrp);
     #    for my $_i (1..15) { sleep 0.060 }
     #
     #    CORE::kill('KILL', $$);
     #    CORE::exit(255);
     # };

       $SIG{HUP} = $SIG{INT} = $SIG{QUIT} = $SIG{TERM} = sub { };

       ...
    }

It sure is nice to have the END block working. But it requires loading MCE::Signal qw( -no_kill9 ) before MCE modules to take effect. Signal handling is not fun. Likewise, parallel programming is crazy. But, with your help 1nickt, it's almost there.

We're on the road. Amazingly, Wi-Fi on the laptop via the phone is working well.

Thank you, 1nickt. Thank you, PerlMonks.


Replies are listed 'Best First'.
Re^5: Interrupt multi-process program while using MCE::Shared hash: END block code does not (all) run
by Anonymous Monk on Apr 07, 2017 at 22:36 UTC

    Am validating the change for _loop in MCE::Shared::Server. Also, in _stop (around line 370), need to comment out two lines. Otherwise, the shared-manager process may linger around as a zombie process.

        sub _stop {
           return unless ($_is_client && $_init_pid && $_init_pid eq "$$.$_tid");

         # return if ($INC{'MCE/Signal.pm'} && $MCE::Signal::KILLED);
         # return if ($MCE::Shared::Server::KILLED);

           ...
        }

      Hi Mario,

      Yes, this code with the two changes to MCE::Shared::Server is working exactly as expected, both when run to completion and when interrupted with Ctrl-C. Nice!

      For completeness I wanted to test what would happen with an uncaught exception, so I added a fatal operation to the loop:

          for ( @{ $chunk_ref } ) {
              say $_ / 0 if $_ == 4;

      ... which resulted in the worker dying with the expected exception and message from Perl, while the manager and the other workers continued to completion, including updating the shared hash:

          perl mce10.pl
          Parent PID 29953
          worker 2 (29957) processing chunk 1
          worker 1 (29956) processing chunk 2
          worker 2 (29957) processing chunk 3
          worker 1 (29956) processing chunk 4
          worker 1 (29956) processing chunk 6
          worker 2 (29957) processing chunk 5
          Illegal division by zero at mce10.pl line 27, <__ANONIO__> line 6.
          Hello from END block: 29957
          worker 1 (29956) processing chunk 7
          Hello from END block: 29956
          Hello from END block: 29953
          Parent in END: $VAR1 = bless( {
                           '00 29957' => '1491661589',
                           '01 29956' => '1491661589',
                           '02 29957' => '1491661591',
                           '03 29956' => '1491661591',
                           '05 29956' => '1491661593',
                           '06 29956' => '1491661595'
                         }, 'MCE::Shared::Hash' );

      ... note the missing key for #4.

      I think this is good optional behaviour, but it would be nice if the default were for an uncaught exception to kill the whole program. One could then choose to have the manager ignore an exception in a worker process via a switch of some kind (maybe an option to MCE::Signal?). In that case I think it would be important to document the behaviour demonstrated above, so that users know the shared data cache will not necessarily contain all the expected data.

      That way, by default, one can count on either the shared data structure being populated as expected or an exception. A partially populated data structure should only be provided on demand, and with a warning.

      Thank you again.


      The way forward always starts with a minimal test.

        Hi 1nickt,

        To get that behavior (an uncaught exception in a worker stops the whole program), one can specify the on_post_exit option. The status code for __DIE__ is typically 255.

            MCE::Loop->init(
                max_workers => 2,
                chunk_size  => 1,
                user_begin  => sub {
                    $SIG{'INT'} = sub {
                        my $signal = shift;
                        say "Hello from $signal: $$";
                        MCE->exit(0);
                    };
                },
                on_post_exit => sub {
                    my ($mce, $e) = @_;
                    if ($e->{status} == 255) {
                        MCE::Signal::stop_and_exit('__DIE__');
                    }
                }
            );

        More info on on_post_exit is found in the MCE documentation. The die handler for MCE workers is found inside MCE::Core::Worker ( ~ line 649 ). I cannot change the MCE->exit(...) line to MCE::Signal::stop_and_exit('__DIE__'), because that would break scripts where MCE is called from inside an eval block.

            local $SIG{__DIE__} = sub {
               ...
               local $SIG{__DIE__}; local $\ = undef;
               my $_die_msg = (defined $_[0]) ? $_[0] : '';
               print {*STDERR} $_die_msg;

               $self->exit(255, $_die_msg, $self->{_chunk_id});
            };

        TODO: When on_post_exit is not specified, have MCE workers abort input on an uncaught exception. Revisit eval. I was unable to get $@ to stick at the manager level. To make this work, the manager needs to call die itself with the error obtained from the worker.

            eval {
               mce_loop { ... } @input
            };

            # TODO: Today, $@ is not set at the manager level.
            # Thus, the eval block succeeds. Will fix this.

            if ( $@ ) {
               ...
            }
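        The idea of calling die at the manager level with the worker's error can be sketched with plain fork and a pipe. This is not how MCE does it or will do it; it is only an illustration in core Perl, and the helper name run_in_child is hypothetical. The child catches its own exception, ships the message to the parent, and exits with status 255; the parent re-raises the message so that an enclosing eval finally sees it in $@.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper (not MCE's API): run $work in a forked child and
# re-raise the child's exception in the parent.
sub run_in_child {
    my ($work) = @_;

    pipe(my $reader, my $writer) or die "pipe failed: $!";
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {                       # child process
        close $reader;
        my $ok = eval { $work->(); 1 };
        print {$writer} $@ unless $ok;     # send the error text to the parent
        close $writer;                     # flushes the pipe
        POSIX::_exit($ok ? 0 : 255);       # leave without running END blocks
    }

    close $writer;                         # parent process
    my $err = do { local $/; <$reader> };
    close $reader;
    waitpid $pid, 0;

    # $@ set in the child never reaches the parent on its own, so the
    # parent has to die again with the message it received.
    die $err if ($? >> 8) == 255;
    return 1;
}

my $ok = eval { run_in_child(sub { die "boom in worker\n" }); 1 };
print "manager saw: $@" unless $ok;
```

        The design point is the same one as in the TODO: an exception lives in the process that raised it, so the eval around the parallel loop can only fail if the manager raises the error again.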

        Fortunately, the on_post_exit handler gives one control over what to do, e.g. restart_worker or stop_and_exit.
