rudds_perl_habit has asked for the wisdom of the Perl Monks concerning the following question:
I have some successful scripts that use threads, but they all create a fixed number of threads. Now I am trying to write a script that creates X threads, where X is determined by the number of search directories. In this particular case, each thread runs a "cleartool find" command on a directory and returns an array of results. For this example I am just using the Unix find command; the "cleartool find" command in ClearCase is similar but takes a lot longer to run.
What I am finding is that on small data it seems to work fine and I get consistent results. But with those really long-running ClearCase commands, I don't always get all the data I expect in the @Final array. There is probably a better way to do this... maybe locking the variable before I update it? I was thinking that each thread updates a different hash key of the variable, so it should be safe to update this way. Or does it need to be locked before each join statement? Any suggestions on how to do this better?
    #!/usr/local/bin/perl
    use Cwd;
    use threads;
    use Data::Dumper;

    my $use_cc = 0;
    my @dirs = ();
    if ( $use_cc ) {
        @dirs = split(/\s+/, $ENV{CLEARCASE_AVOBS});
    } else {
        @dirs = qw(/bin /sbin /usr/local/bin /usr/sfw/bin /usr/bin);
    }

    # it's a clearcase thing
    my $branch = "v4.0.0_gxp_patch";

    # hash of dir names with thread values
    my %threads = ();

    # hash of dir names with arrays of found items
    my %Found = ();

    # large array to hold all results
    my @Final = ();

    foreach my $dir ( sort @dirs ) {
        chomp($dir);
        # add dir name to hash
        $Found{$dir} = ();
        # create thread and add it to threads hash
        $threads{$dir} = threads->create({'context' => 'list'}, 'find_thread', $dir, $use_cc, $branch);
    }

    foreach my $dir ( sort keys %threads ) {
        # cycle through threads hash and join up results, put them in hash-of-arrays
        @{ $Found{$dir} } = $threads{$dir}->join();
    }

    # flatten the smaller hash-of-arrays into one large array for easier processing later on
    foreach my $dir ( sort keys %Found ) {
        foreach my $item ( sort @{ $Found{$dir} } ) {
            push(@Final, $item);
        }
    }

    print Dumper(@Final);
    print "SIZE: " . scalar(@Final) . "\n";

    sub find_thread {
        my $dir = shift;
        my $cc_flag = shift;
        my $branch = shift;
        my @results;
        chdir $dir or die "Cannot change to $dir\n";
        print "Finding all files in dir: $dir\n";
        if ( $cc_flag ) {
            @results = `cleartool find -all -version 'brtype($branch)' -print 2>&1`;
        } else {
            @results = `find $dir -print 2>&1`;
        }
        return @results;
    }
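For comparison, here is a minimal sketch of the same join-based collection without the chdir call. It rests on two points from the threads documentation: variables that are not marked shared are private to each thread, so only the main thread ever writes to the results hash and no locking is needed around join(); and on platforms other than MSWin32 the current working directory is shared by all threads, so a chdir in one thread can affect commands started by another. Whether that race explains the missing ClearCase results is an assumption, not something confirmed here. The names find_in_dir and find_all are hypothetical, the sketch uses the plain Unix find command only, and unlike the backtick version it does not capture stderr.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use threads;

# Hypothetical helper: search one directory without calling chdir,
# since the working directory is shared by all threads on non-Windows systems.
sub find_in_dir {
    my ($dir) = @_;
    # List-form pipe open bypasses the shell and passes $dir as one argument.
    open(my $fh, '-|', 'find', $dir, '-print')
        or die "Cannot run find on $dir: $!\n";
    chomp(my @results = <$fh>);
    close($fh);
    return @results;
}

# Hypothetical driver: one thread per directory, results keyed by directory.
sub find_all {
    my @dirs = @_;

    my %thread_for;
    foreach my $dir (@dirs) {
        # List context so that join() returns every line the thread found.
        $thread_for{$dir} =
            threads->create({ 'context' => 'list' }, \&find_in_dir, $dir);
    }

    my %found;
    foreach my $dir (sort keys %thread_for) {
        # join() blocks until the thread ends and copies its return values
        # into the main thread; %found is only ever touched here.
        $found{$dir} = [ $thread_for{$dir}->join() ];
    }
    return \%found;
}

my $found = find_all(qw(/bin /usr/bin));
my @final = map { @{ $found->{$_} } } sort keys %$found;
print "SIZE: " . scalar(@final) . "\n";
```

If a thread dies, its join() returns an empty list, so checking $thr->error() after each join would be one way to tell a failed search from an empty directory.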
Replies are listed 'Best First'.
Re: creating unknown number of threads and then join results
by BrowserUk (Patriarch) on Jul 29, 2013 at 23:42 UTC
by rudds_perl_habit (Novice) on Jul 30, 2013 at 16:36 UTC
by BrowserUk (Patriarch) on Jul 30, 2013 at 18:32 UTC
by rudds_perl_habit (Novice) on Jul 30, 2013 at 18:43 UTC
by BrowserUk (Patriarch) on Jul 30, 2013 at 18:54 UTC
by Anonymous Monk on Jul 31, 2013 at 00:22 UTC |