in reply to Re^2: Parallel::ForkManager for any array
in thread Parallel::ForkManager for any array
Possibly you could improve performance by first building a list of the files to be deleted, with their full path names, and then letting the workers loop through that list, rather than handing each worker a top-level "file" that may be a directory. That would even out the workload among the workers.
If this is really a problem that can benefit from parallelization, i.e. there are a lot of files and the task is not I/O-bound (which I suspect it is), I would also consider a technique that uses chunking, so each worker is handed a block of files to process before pulling the next, as the excellent parallelization engine MCE provides by default. The following is untested and lacks error checking, debug output, etc., but should give you some ideas:
use strict;
use warnings;
use Path::Iterator::Rule;
use MCE;

my $rule = Path::Iterator::Rule->new;   # add constraints to the rule here
my $root = '/some/path';
my $iter = $rule->iter( $root, { depthfirst => 1 } );

my @list;
while ( my $file = $iter->() ) {
    push @list, $file;
}

my $chunk_size = 100;                   # whatever makes sense for you

MCE->new(
    user_func   => \&task,
    max_workers => 10,
    chunk_size  => $chunk_size,
);
MCE->process( \@list );
MCE->shutdown;

exit 0;

# With chunk_size > 1, MCE passes each worker an array ref holding
# a whole chunk of items, so we loop over it here.
sub task {
    my ( $mce, $chunk_ref, $chunk_id ) = @_;
    for my $file ( @{$chunk_ref} ) {
        unlink($file) if -f $file;
        rmdir($file)  if -d $file;
    }
}
__END__
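For comparison, the same chunking idea can be sketched with Parallel::ForkManager itself, the module this thread is about. This is a toy, untested sketch: the temporary demo directory, the 4 workers, and the chunk size of 5 are all assumptions for illustration, not a recommendation.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use List::Util qw(min);
use Parallel::ForkManager;

# Demo setup: a throwaway directory holding some files to delete.
my $root = tempdir( CLEANUP => 1 );
my @list;
for my $n ( 1 .. 20 ) {
    my $path = "$root/file$n.txt";
    open my $fh, '>', $path or die "open $path: $!";
    close $fh;
    push @list, $path;
}

my $chunk_size = 5;                       # tune for your workload
my $pm = Parallel::ForkManager->new(4);   # 4 workers for the demo

# Hand each child a chunk of files rather than one file per fork,
# so the fork overhead is amortized over many deletions.
for ( my $i = 0; $i < @list; $i += $chunk_size ) {
    my @chunk = @list[ $i .. min( $i + $chunk_size - 1, $#list ) ];
    $pm->start and next;                  # parent keeps looping
    for my $file (@chunk) {
        unlink($file) if -f $file;
        rmdir($file)  if -d $file;
    }
    $pm->finish;                          # child exits
}
$pm->wait_all_children;
```

Note that each chunk costs one fork here, whereas MCE reuses a fixed pool of workers, which is one reason to prefer it when the list is long.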
Hope this helps!
Re^4: Parallel::ForkManager for any array
by MissPerl (Sexton) on Oct 17, 2018 at 15:11 UTC