gennari has asked for the wisdom of the Perl Monks concerning the following question:

This script is called by the main DB backup script. I'm trying to get through "copy, verify, zip, move" as quickly as possible by using child processes. I'm also trying to allow for the case where there isn't enough room to simply copy everything, verify everything, zip everything, then move everything. I think I just barely have a hold on how forking works, but something keeps going wrong. I've got two children trying to work on the same file (as well as other errors). Anyone want to give me some direction?
#!/usr/local/bin/perl -w
# This does all of the copying, verifying, zipping, and moving
# of backup files.
#
# ToDo:
#   1. use @ARGV to hand in list of data file paths, dop, blocksize, etc.

use POSIX ":sys_wait_h";
use DBSystemUtils;
use File::Basename;

#####################################################
# All this will be passed in from the main program
# if I can - much of what's here is just to make this
# program run on its own while developing it
$children  = 0;
$dop       = 3;      # degree of parallelism
$blocksize = 8192;
@dbFilePaths = ("/export/home/oracle/ORIG/file01.dbf",
                "/export/home/oracle/ORIG/file02.dbf",
                "/export/home/oracle/ORIG/system01.dbf");
@dbFileNames = basename(@dbFilePaths);
#$origDir = dirname($dbFilePaths[0]);
$localDir = "/export/home/oracle/LOCAL";
$destDir  = "/export/home/oracle/DEST";
%fileSizes = ("/export/home/oracle/ORIG/file01.dbf"   => 83894272,
              "/export/home/oracle/ORIG/file02.dbf"   => 83894272,
              "/export/home/oracle/ORIG/system01.dbf" => 83894272);
#$totalSize = 0;
foreach $path (@dbFilePaths) {
    $file = basename($path);
}
%fileStatus = map {$_, "0"} @dbFileNames;
#%statusCodes = (0   => "orig",
#                1   => "copying",
#                2   => "copied",
#                3   => "verifying",
#                4   => "verified",
#                5   => "zipping",
#                6   => "zipped",
#                7   => "moving",
#                8   => "moved",
#                999 => "error");
$availSpace = DBSystemUtils->spaceAvailable($localDir);
# make space tight for testing purposes
$availSpace = $availSpace - 439438000;
######################################################

while (@dbFilePaths) {
    $file = shift(@dbFilePaths);
    # if there's room for the file in localDir, copy it there
    if ($fileSizes{$file} < $availSpace) {
        # ... (a chunk of this branch is missing from the post) ...
            push (@zippedFiles, $zippedFile);
        }
        $fileToWorkOn = smallest($localDir, @zippedFiles);
        $command = "mv $fileToWorkOn $destDir";
    }
    elsif (@verifiedFiles) {
        print "Have verified files.\n";
        foreach $path (@verifiedFilePaths) {
            push (@verifiedFiles, basename($path));
        }
        $fileToWorkOn = smallest($localDir, @verifiedFiles);
        $command = "gzip $fileToWorkOn";
    }
    elsif (@copiedFiles) {
        print "Have copied files.\n";
        foreach $path (@copiedFilePaths) {
            push (@copiedFiles, basename($path));
        }
        $fileToWorkOn = smallest($localDir, @copiedFiles);
        $command = "dbv file=$fileToWorkOn blocksize=$blocksize";
    }
    else {
        print "Problem\n";
        next;
        }
    }
    forkIt($command);
}

sub oneStatus {
    my $stat     = shift;
    my $hash_ref = shift;
    @subList = ();
    foreach $key (keys %{$hash_ref}) {
        push (@subList, $key) if $$hash_ref{$key} == $stat;
    }
    return @subList;
}

sub smallest {
    my $dir     = shift;
    my $arr_ref = shift;
    my $smallFileListing = "";
    my @dirListing = ();
    $smallFile = "";
    @dirListing = `ls -l @$arr_ref | sort +4nr`;
    chomp(@dirListing);
    $smallFileListing = $dirListing[0];
    $smallFile = (split /\//, $smallFileListing)[-1];
    $smallFilePath = $dir . "/" . $smallFile;
    return $smallFilePath;
}
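The forkIt sub fell off the bottom of the post, so here is a hypothetical sketch of what a fork-and-reap dispatcher along these lines could look like. The $dop limit and $children counter echo the script above; the dispatch logic itself is an assumption, not the author's actual code:

```perl
use strict;
use warnings;
use POSIX ":sys_wait_h";

my $dop      = 3;   # degree of parallelism, as in the script above
my $children = 0;   # count of outstanding child processes

# Hypothetical forkIt: reap finished children, throttle to $dop,
# then fork a child to run $command.
sub forkIt {
    my ($command) = @_;

    # Reap any children that have already exited, without blocking.
    while (waitpid(-1, WNOHANG) > 0) {
        $children--;
    }

    # At the parallelism limit? Block until one child finishes.
    if ($children >= $dop) {
        wait();
        $children--;
    }

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # Child: hand off to the shell command and never return.
        exec($command) or die "exec failed: $!";
    }
    $children++;    # parent: one more outstanding child
    return $pid;
}
```

The parent would call forkIt once per command and finish with a drain loop (wait() until $children reaches zero) so no zombies are left behind.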

Replies are listed 'Best First'.
Re: fork and reap
by goldclaw (Scribe) on Mar 14, 2001 at 04:31 UTC
    OK, it's hard to say what's going wrong when you don't include forkIt. Anyway, let's see if we can find some improvements. The only thing I can really make out is smallest. There's no need to use ls, as it might confuse your forkIt (if you use wait). Try this instead:
    sub smallest {
        my ($dir, $arr_ref) = @_;
        my %sc;    # cache of file sizes
        return +(sort { $sc{$a} ||= -s $a;
                        $sc{$b} ||= -s $b;
                        $sc{$a} <=> $sc{$b} }
                 map { "$dir/$_" } @$arr_ref)[0];
    }
    That should be efficient enough.

    As for the rest, I don't understand what you're trying to do. You have one main loop. The first thing you do is shift a file off @dbFilePaths, which means each of those file paths only goes through the loop once. Inside the loop is one big if/elsif block, so for each of those file paths one of those branches executes, and finally you fork off, presumably to execute the command you set in the if/elsif block.

    Furthermore, since you never set @copiedFiles or @verifiedFiles, two of those branches will never execute. I think you should try to post a version that you feel is somewhat correct, or failing that, put in some comments on what you think is going on in the code.

    GoldClaw

      Dunno why forkIt fell off the bottom. You're exactly right about the flow of the program. ForkIt is supposed to send a child off to run the command.

      Thanx for your sorting sub. I'll plug that in. It didn't even occur to me that the ls call would show up as a forked thing and throw off the rest of the tests. Many thanx for pointing that out.

      I've been thinking about moving to a system that doesn't use the copied, zipped, and verified lists, since I have a status hash.

      Gonna go implement your excellent suggestions and some of the other ideas I've come up with.

      Thanx.
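      The status-hash approach mentioned above could replace the copied/verified/zipped lists entirely. A rough sketch, reusing oneStatus and the commented-out status codes from the original script; the nextCommand sub, the priority order, and the sample %fileStatus values are invented for illustration:

      ```perl
      use strict;
      use warnings;

      my $blocksize = 8192;
      my $destDir   = "/export/home/oracle/DEST";

      # Sample statuses (2 = copied, 4 = verified, 6 = zipped, per the
      # commented-out %statusCodes in the original script).
      my %fileStatus = ('file01.dbf' => 2, 'file02.dbf' => 4, 'system01.dbf' => 6);

      # Same idea as the original oneStatus: files currently in a given status.
      sub oneStatus {
          my ($stat, $hash_ref) = @_;
          return grep { $hash_ref->{$_} == $stat } keys %{$hash_ref};
      }

      # Pick the next command straight from the status hash: move zipped
      # files first, then zip verified ones, then verify copied ones.
      sub nextCommand {
          my ($hash_ref) = @_;
          if (my @zipped = oneStatus(6, $hash_ref)) {
              return "mv $zipped[0] $destDir";
          }
          if (my @verified = oneStatus(4, $hash_ref)) {
              return "gzip $verified[0]";
          }
          if (my @copied = oneStatus(2, $hash_ref)) {
              return "dbv file=$copied[0] blocksize=$blocksize";
          }
          return undef;    # nothing ready to work on
      }
      ```

      Bumping a file's status (e.g. to 5, "zipping") before forking a child for it would also stop two children from grabbing the same file.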