georgi has asked for the wisdom of the Perl Monks concerning the following question:
I'm hoping that somebody out there can help me with what's been a real bugbear for the last few days:
BACKGROUND:
I need to write a subroutine that (among other things) turns thousands of TIF files into dozens of PDF files on a four-processor Intel machine running Windows Server 2003. I'm using ActiveState Perl 5.8, and when I try to parallelize this process, my filehandles seem to get horribly crossed when I really don't think they should. In the end, it appears that the parent thread's output to STDOUT is showing up in a child thread's filehandle... (?!?)
I hope somebody here can give me the insight I need, in order to be able to get this working...
APPROACH:
The proper solution (rather than ImageMagick, which is prohibitively inefficient), is:
I originally tried Parallel::ForkManager and, more recently, ithreads. Both exhibited the same strange behavior: unrelated filehandles appear to get crossed between different processes or threads... (!!)
CODE:
    ############################################################
    # ConvertTiffsToPS
    #
    # Convert a list of tiff files to PS files
    #
    # RETURNS:
    #   A list of the paths of the postscript files
    ############################################################
    use threads;
    use threads::shared;

    sub ConvertTiffsToPS {
        my (@tiffpaths) = @_;
        my @pspaths;    # to be filled in.
        my @threads;
        foreach my $tiffpath (@tiffpaths){
            (my $pspath = $tiffpath) =~ s/\.tiff?$/\.PS/i;
            push @pspaths, $pspath;
            print "Converting $tiffpath to $pspath\n";
            my $thread = async { Tiff2PS($tiffpath, $pspath) };
            push @threads,$thread;
        }
        print("Waiting for child processes...");
        print join(",\n",map {$_->join()} @threads);
        print("child processes complete...");
        confess "Somethings wrong".Dumper(\@tiffpaths,\@pspaths)
            unless (scalar(@tiffpaths) == scalar(@pspaths));
        # return the list of PS files
        return @pspaths;
    }

    # Take a tifffile, and produce a PS file next to it.
    sub Tiff2PS {
        my ($tiffpath,$pspath) = @_;
        #local $| = 1; # has no visible effect either way...
        print qq("$TIFF2PS_COMMAND" "$tiffpath" |\n);
        open TIFF2PS, qq("$TIFF2PS_COMMAND" "$tiffpath"|)
            or confess "Can't run $TIFF2PS_COMMAND";
        open PSOUT, ">$pspath"
            or confess "Can't open $pspath for writing!\n";
        my $flag = 0;
        foreach my $line (<TIFF2PS>) {
            print PSOUT $line;
            # Add the following line only once...
            if (!$flag && $line =~ /^%%BoundingBox: (\d+) (\d+) (\d+) (\d+)/o) {
                my ($w,$h) = ($3-$1, $4-$2);
                # Fix the pagesize, since GS wants everything to be 8.5x11" portrait
                print PSOUT "<< /PageSize [$w $h] >> setpagedevice\n";
                # Short-circuit prevents expensive regexp match afterwards
                $flag = 1;
            }
        }
        close TIFF2PS;
        close PSOUT;
        if (! -e $pspath) {
            warn("**TIFF2PS Problem: $!");
            confess "Tiff2PS Error: $!";
        } else {
            print "$pspath created...\n";
        }
        return $pspath;
    }
While dozens of tiff files end up being converted correctly, at least one of them always seems to look like this:
    Converting c:/4-1-13/00000003/00000014.TIF to c:/4-1-13/00000003/00000014.PS
    %!PS-Adobe-3.0 EPSF-3.0
    %%Creator: tiff2ps
    %%Title: c:/4-1-13/00000003/00000013.TIF
    %%CreationDate: Sat Sep 03 16:41:24 2005
    %%DocumentData: Clean7Bit
    ...
When I do the same thing with Parallel::ForkManager, I see the same result (though it occasionally appears to hang indefinitely). If I use Parallel::ForkManager with 0 forks (for debugging), it works just fine...
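One thing that may be worth testing is sketched below. It rests on an unconfirmed assumption and is not a known fix: if the handles get crossed while a piped open is spawning its child process, then serializing the spawn and the STDOUT prints through one shared lock should make the stray lines disappear. The names `$io_lock`, `say_locked`, and `run_locked` are hypothetical, and the Perl interpreter itself stands in for tiff2ps so that the sketch runs without it.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# ASSUMPTION: handles get crossed while a piped open is spawning its child,
# so spawning and STDOUT printing are serialized through one shared lock.
my $io_lock : shared;

# Hypothetical helper: print a progress message while holding the lock.
sub say_locked {
    my ($message) = @_;
    lock($io_lock);
    print $message;
}

# Hypothetical helper: run a shell command and return its output lines.
# Uses a lexical filehandle rather than a bareword such as TIFF2PS.
sub run_locked {
    my ($command) = @_;
    my $pipe;
    {
        lock($io_lock);    # released at the end of this block
        open($pipe, "$command |") or die "Can't run $command: $!";
    }
    my @lines = <$pipe>;   # read outside the lock so the children overlap
    close($pipe);
    return @lines;
}

# Stand-in for tiff2ps: each child process simply prints its job number.
my $perl = $^X;
my @threads = map {
    my $id = $_;
    say_locked("Starting job $id\n");
    threads->create(sub { join '', run_locked(qq("$perl" -e "print $id")) });
} 1 .. 4;

my @results = map { $_->join() } @threads;
print "@results\n";
```

In the code above, `Tiff2PS` would call `run_locked` with the tiff2ps command line and `ConvertTiffsToPS` would print through `say_locked`. If the corrupted PS files still show up, the assumption is wrong and the crossing happens elsewhere.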
What's going on here? How can I avoid this craziness?
Thanks in advance for your advice...
--Georgi
Replies are listed 'Best First'.

Re: win32, threads, and filehandles, oh my!!
by BrowserUk (Patriarch) on Sep 03, 2005 at 23:15 UTC

Re: win32, threads, and filehandles, oh my!!
by GrandFather (Saint) on Sep 04, 2005 at 00:11 UTC
by BrowserUk (Patriarch) on Sep 04, 2005 at 01:07 UTC

Re: win32, threads, and filehandles, oh my!!
by davidrw (Prior) on Sep 04, 2005 at 00:03 UTC