Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hey all
I posted a question on this before and was asked to post my code to better illustrate what's going on. The script processes files with Unix commands, and when it reaches a certain point it should fork off two processes to deal with two different file types. Ideally, once those forks are complete, the initial script would resume. Currently, one of the two forks ends up running twice. Also, the script exits as soon as the forks begin (which is why I finished the processing inside the forks). If someone can point out my errors so I can get this working the way I'd like, that would be great.
Thank you all, and sorry for the sloppy code. I've only been doing this a year as of next week :)
#!/opt/local/bin/perl -w
open(STDERR, ">errors.log") || die "cannot create error log";
$dir = `pwd`;
chomp $dir;
if ($dir =~ m/(^.*)(GQ.*)/) {
    $path = $1;
    $lib = $2;
};
chop $path;
$xpath = "$path/ProjectSpecificInfo/";
print "working from $dir\n";
print "library is $lib\n";
opendir (DIR, ".") || die "cannot read from current dir";
open (XREF, "<$xpath$lib.OUT") || die "cannot open cross-ref file $xpath$lib.OUT";
%XREF = ();
while (<XREF>) {
    ($SeqID, $MNid) = (split /\t/, $_)[0,1];
    $XREF{$SeqID} = $MNid;
};
`phred -pd phd_dir *.gz`;
`phran -q $phran_cut phd_dir/*.phd.1`;
`mv $dir/phd_dir/*.raw $dir/phd_dir/phran_dir/`;
print "raw files moved to phran, now sorting\n";
while ($sequence = readdir (DIR)) {
    if ( $sequence =~ /MN/ ) {
        if ( $sequence =~ /(^.*)(\.gz)/ ) {
            $sequence = $1;
        };
        if (exists $XREF{$sequence}) {
            $seqID = $XREF{$sequence};
            if ($seqID =~ TB) {
                `mv $dir/phd_dir/phran_dir/$sequence.raw $dir/phd_dir/phran_dir/3prime/`;
            } else {
                `mv $dir/phd_dir/phran_dir/$sequence.raw $dir/phd_dir/phran_dir/5prime/`;
            };
        };
    };
};
$threeprimedir = "$dir/phd_dir/phran_dir/3prime";
$fiveprimedir = "$dir/phd_dir/phran_dir/5prime";
@primes = ($threeprimedir, $fiveprimedir);
foreach $vfdir ( @primes ) {
    $pid = fork and next;
    &VF4($line);
    print "Completed $dir\n";
};
print "processing complete for $dir\n";
$dir/phd_dir/phran_dir/vf_dir/`;

sub finish {
    chdir "$dir/phd_dir/phran_dir/vf_dir/";
    print "Running artifact filter\n";
    `af *.seq > afresults.log`;
    `mv *qc $dir/phd_dir/phran_dir/vf_dir/af_dir/`;
    `mv afresults.log $dir/phd_dir/phran_dir/vf_dir/af_dir/`;
    `mv *.af.out $dir/phd_dir/phran_dir/vf_dir/af_dir/`;
    `mv $dir/phd_dir/phran_dir/vf_dir/*.raw $dir/phd_dir/phran_dir/`;
};

sub VF4($) {
    if ($vfdir =~ /3prime/) {
        chdir "$dir/phd_dir/phran_dir/3prime";
        `gstVF4 -o $vffile3 *.raw > 3primevf.log`;
        print "finished Running VF4 in 3prime\n";
        `mv $dir/phd_dir/phran_dir/3prime/* $dir/phd_dir/phran_dir/vf_dir/`;
        chdir "..";
        `rm $vfdir`;
        #&finish;
    }elsif ($vfdir =~ /5prime/) {
        chdir "$dir/phd_dir/phran_dir/5prime";
        `gstVF4 -o $vffile5 *.raw > 5primevf.log`;
        print "finished Running VF4 in 5prime\n";
        `mv $dir/phd_dir/phran_dir/5prime/* $dir/phd_dir/phran_dir/vf_dir/`;
        chdir "..";
        `rm $vfdir`;
        #&finish;
    };
    &finish;
};

Re: Forking too much fun...
by bobn (Chaplain) on Jun 25, 2003 at 22:22 UTC
    You're not arranging for your child processes to exit, so they take up where the parent left off.

    The first time through the foreach loop, you fork a new process. When that child finishes &VF4($line);, it carries on with the loop and reaches the fork itself on the next iteration, forking again to run &VF4($line); for $fiveprimedir, which the original parent also forks a child to do. Hence &VF4($line); runs twice for $fiveprimedir.

    I think if you call exit; right after &VF4($line); you'll get the effect you want. (UNTESTED).
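
To see the effect in isolation, here is a minimal sketch that counts how often the per-directory work runs with and without the exit. Everything in it is an assumption made for illustration: runs_for is a hypothetical helper, the directory names are just labels, and the child only reports its name through a pipe instead of calling the real &VF4.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Run the fork loop over @names in a throwaway process tree and return the
# sorted list of names that were worked on (one entry per run).
# $exit_in_child false mirrors the loop in the question; true adds the exit.
sub runs_for {
    my ($exit_in_child, @names) = @_;

    pipe(my $reader, my $writer) or die "pipe failed: $!";

    my $top = fork;
    die "fork failed: $!" unless defined $top;

    if ($top == 0) {                  # stand-in for the main script
        close $reader;
        for my $name (@names) {
            my $pid = fork;
            die "fork failed: $!" unless defined $pid;
            next if $pid;             # parent side: on to the next name
            # Child side: hypothetical stand-in for &VF4, it only reports its name.
            syswrite($writer, "$name\n");
            exit 0 if $exit_in_child; # without this the child keeps looping
        }
        1 while wait() != -1;         # reap whatever this process forked
        exit 0;
    }

    close $writer;                    # reader sees EOF once every writer has exited
    chomp(my @seen = <$reader>);
    close $reader;
    waitpid($top, 0);
    return sort @seen;
}

my @without_exit = runs_for(0, '3prime', '5prime');
my @with_exit    = runs_for(1, '3prime', '5prime');

print "without exit: @without_exit\n";
print "with exit:    @with_exit\n";
```

Without the exit, the first child falls through to the second iteration and forks its own worker, so 5prime should show up twice in the first line and once in the second.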

    update: missed the second half of your question, about the main process finishing immediately after the forks.

    Typical use of fork, with the parent waiting for its children, is:
    # UNTESTED
    for $line (@whatever) {
        $pid = fork();    # 0 in child, child's pid in parent
        if ($pid) {
            # in parent process
            push @pids, $pid;
        } else {
            # child process
            &VF4($line);
            exit;
        }
    }
    for ( @pids ) { waitpid($_,0) }    # wait for all children
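
A slightly fuller sketch of the same pattern, assuming you also want to catch a failed fork and see how each child exited. The %jobs table and its exit codes are made up for the example; in the real script the child would call &VF4 instead of exiting straight away.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical job list: name => exit code the child will return.
my %jobs = ('3prime' => 0, '5prime' => 3);

my %name_for;    # child pid => job name
for my $name (sort keys %jobs) {
    my $pid = fork;
    die "fork failed: $!" unless defined $pid;   # undef means no child was created
    if ($pid == 0) {
        # Child: the real work would go here; this sketch only exits.
        exit $jobs{$name};
    }
    $name_for{$pid} = $name;                     # parent: remember the child
}

# Parent: block until every child is done, recording each exit status.
my %status;
for my $pid (keys %name_for) {
    waitpid($pid, 0);
    $status{ $name_for{$pid} } = $? >> 8;        # high byte of $? is the exit code
}

print "$_ exited with status $status{$_}\n" for sort keys %status;
print "processing complete\n";
```

Because the parent only prints its final message after the waitpid loop, the main script resumes once both children have finished rather than racing ahead of them.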


    --Bob Niederman, http://bob-n.com