Beefy Boxes and Bandwidth Generously Provided by pair Networks
Syntactic Confectionery Delight
 
PerlMonks  

Re^5: shared scalar freed early

by marioroy (Prior)
on Feb 26, 2017 at 01:52 UTC ( [id://1182865]=note: print w/replies, xml ) Need Help??


in reply to Re^4: shared scalar freed early
in thread shared scalar freed early

Greetings, fellow monks.

This is a continuation of my previous post. I try the same thing with MCE::Shared. MCE::Hobo spawns threads on the Windows platform and processes otherwise, on other platforms. The two iterators return a shared object during construction. Maybe the OP can run without having to spawn a million threads. Thus, the reason for writing this and the prior post.

#!/opt/perl/bin/perl ## http://www.perlmonks.org/?node_id=1182580 use strict; use warnings; use MCE::Hobo; use MCE::Shared; use Time::HiRes 'time'; my %data = (); foreach ('a'..'z') { $data{$_} = $_ x 200; } my $chunk_size = 50; my $iterations = 100; my $num_workers = 5; my $output_path = "h.txt"; my $iter_i = Iter::Input->new($chunk_size, $iterations); my $iter_o = Iter::Output->new($output_path); test_mce_hobo(); sub test_mce_hobo { my $start = time; MCE::Hobo->create('work') for (1..$num_workers); MCE::Hobo->waitall; $iter_o->close(); printf STDERR "testa done in %0.02f seconds\n", time - $start; } # Hobo task to run in parallel sub work { while ( my ($chunk_id, $data) = $iter_i->recv() ) { my @ret = (); foreach my $chunk (@$data) { my %output = (); foreach my $key (keys %$chunk) { if ($key eq '.') { $output{$key} = $$chunk{$key}; next; } my $val = $$chunk{$key}; my $uc = uc($key); $val =~ s/$key/$uc/g; $output{$key} = $val; } push(@ret,\%output); } my $buf = ''; foreach my $data (@ret) { foreach my $key (sort keys %$data) { $buf .= $$data{$key}; } $buf .= "\n"; } $iter_o->send($chunk_id, $buf); } } ##################################################################### package Iter::Input; sub new { my ( $class, $chunk_size, $iterations ) = @_; my ( $chunk_id, $seq_a ) = ( 0, 1 ); MCE::Shared->share( bless [ $iterations, $chunk_size, \$chunk_id, \$seq_a ], $class ); } sub recv { my ( $self ) = @_; my ( $iters, $chunk_size, $chunk_id, $seq_a ) = @{ $self }; return if ( ${$seq_a} > $iters ); my @chunk; foreach my $seq_b ( 1 .. $chunk_size ) { my %retdata = %data; $retdata{'.'} = ${$seq_a} * $seq_b; push @chunk, \%retdata; } # These were made references on purpose, during construction, # to minimize array access: e.g. $self->[1]++, $self->[3]++ ${$chunk_id}++, ${$seq_a}++; return ${$chunk_id}, \@chunk; } 1; ##################################################################### package Iter::Output; sub new { my ( $class, $path ) = @_; my ( $order_id, $fh, %hold ) = ( 1 ); # Note: Do not open the file handle here, during construction. # The reason is that sharing will fail (cannot serialize $fh). MCE::Shared->share( bless [ $path, $fh, \$order_id, \%hold ], $class ); } sub send { my ( $self, $chunk_id, $output ) = @_; my ( $path, $fh, $order_id, $hold ) = @{ $self }; if ( !defined $fh ) { open $fh, '>', $path or die "open error: $!"; $self->[1] = $fh; } # hold temporarily, until orderly $hold->{$chunk_id} = $output; while (1) { last unless exists( $hold->{ ${$order_id} } ); print {$fh} delete( $hold->{ ${$order_id} } ); ${$order_id}++; } return; } sub close { my ( $self ) = @_; if ( defined $self->[1] ) { CORE::close $self->[1]; $self->[1] = undef; } return; } 1;

Regards, Mario.

Replies are listed 'Best First'.
Re^6: shared scalar freed early
by chris212 (Scribe) on Mar 06, 2017 at 17:28 UTC
    So the hobos inherit the Input and Output objects? The $fh variable is shared at construction but initialized at first use. When a hobo first calls recv and opens the file, that file handle is shared among hobos? How does buffering and position tracking work? Is it handled by an MCE server process with it's own buffering and file I/O will be done on the server process and the result returned via IPC to the hobo?

      Greetings,

      Re: Is it handled by an MCE server process with it's own buffering and file I/O will be done on the server process and the result returned via IPC to the hobo?

      Yes. The output-class constructs a shared object. Initially, the file handle ($fh) is not defined. This allows MCE::Shared->share to pass the object to the shared-manager, where it will run. The entry point to the shared-manager is the shared-object itself. Method calling and arguments are sent via IPC. The mechanism is fully wantarray aware. Any complex data during transfers involves serialization which is transparent to the application.

      Regarding IO position tracking, there is none for shared file-handles constructed via MCE::Shared. The close method was added to the class, so that any pending writes, held in a buffer by Perl, are flushed to disk before exiting; e.g. $iter_o->close().

      package Iter::Output; sub new { my ( $class, $path ) = @_; my ( $order_id, $fh, %hold ) = ( 1 ); # Note: Do not open the file handle here, during construction. # The reason is that sharing will fail (cannot serialize $fh). MCE::Shared->share( bless [ $path, $fh, \$order_id, \%hold ], $class ); } sub send { ... } sub close { ... }

      Regarding Iter::Input, the construction makes a shared object by sending it to the shared-manager process, where it is managed.

      package Iter::Input; sub new { my ( $class, $chunk_size, $iterations ) = @_; my ( $chunk_id, $seq_a ) = ( 0, 1 ); MCE::Shared->share( bless [ $iterations, $chunk_size, \$chunk_id, \$seq_a ], $class ); } sub recv { ... }

      Replacing MCE::Hobo with threads is possible. One can run MCE::Shared alongside threads and threads::shared.

      use threads; ... test_threads(); sub test_threads { my $start = time; my @thrs; # must save threads to join later push @thrs, threads->create('work') for (1..$num_workers); # wait for threads to finish $_->join for @thrs; # close the shared file handle, flushes buffer $iter_o->close(); printf STDERR "testa done in %0.02f seconds\n", time - $start; }

      Regards, Mario.

Re^6: shared scalar freed early
by chris212 (Scribe) on Feb 27, 2017 at 22:36 UTC
    Testing on Windows shows that to be a bit faster. I will look into the MCE modules. Thanks!

      Hello, chris212.

      It's always great to hear back on whether the solution posted was helpful or not. Thank you for writing. I did some extra testing myself and came to realization that your "testa" demonstration takes a very long time under Cygwin (~ 40x longer) versus running under Strawberry Perl. On the contrary, the two MCE solutions completed in the same time frame for Cygwin and Strawberry Perl.

      Kind regards, Mario.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://1182865]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others making s'mores by the fire in the courtyard of the Monastery: (5)
As of 2024-04-25 09:05 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found