hoffmann has asked for the wisdom of the Perl Monks concerning the following question:

I have the following Perl script:
#!/usr/bin/perl -w
#################################################################################
#
# Program to submit bootstrap jobs to cluster
#
# Usage: qsubbootstrap.pl <output dir> <lower#> <upper#> <ARACNE arguments>
# Example: qsubbootstrap.pl adjdir 1 100 -i data.exp -l tflist.txt -s probe.txt
#          -p 1e-5 -e 0
#
# Description: This program submit the bootstrap jobs to cluster and
#   store all the output bootstrapping networks in the directory <output dir>.
#   The directory will be created if it doesn't exist.
#   Users also need to specify the range of bootstrapping samples to submit.
#
#################################################################################

use Fcntl;

if (scalar(@ARGV) < 4) {
    die "Incorrect number of arguments!\nUsage: qsubbootstrap.pl <output dir> <lower#> <upper#> <ARAC arguments> \nExample: qsubbootstrap.pl adjdir 1 100 -i data.exp -l tflist.txt -p 1e-5 -e 0\n";
} else {
    ($dir,$r1,$r2,@otherarg)=@ARGV;
    $otherarg="@otherarg";
}

$otherarg =~ m/-i\s+(\S+)\.\S+\s/;
$inputfile = $1;
$dir=$dir."\/" unless ($dir=~m/\/$/);

@folderfound=<./*/>;
$allfolder="@folderfound";
system "mkdir ".$dir unless ($allfolder=~m/$dir/);

foreach $bs ($r1..$r2){
    $bs3 = sprintf("%03i", $bs);
    $shfile = "aracne".$bs3."\.sh";
    sysopen(OUTFILE, $shfile, O_WRONLY|O_TRUNC|O_CREAT, 0666) || die $!;
    select OUTFILE;
    print "\#\$ -S \/bin\/bash -cwd -j y -o soe \n\n \.\/arac $otherarg -r $bs -o ".$dir.$inputfile."_r".$bs3.".adj";
    close OUTFILE;
    system "qsub -p -100 $shfile";
    system "rm ".$shfile;
}

I want to first submit all of the data to the cluster and then select the upper 30% and the lower 30% of the data from the bootstrapped set. How could I implement this?

Re: Bootstrapping Implementation #2
by dragonchild (Archbishop) on Jul 30, 2008 at 17:59 UTC
    What have you tried? How do you think you could solve the problem?

    My criteria for good software:
    1. Does it work?
    2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?
      Well, I've currently submitted all of the data for bootstrapping as shown in the above code. I'm not sure how I'd choose the upper and lower 30% of data from the bootstrapped set.
        Talk through it. Describe how you would do it on paper. Define it for me. If you can't tell me how to do it, then you definitely cannot tell a computer.
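
One possible way to talk it through on paper: reduce each item in the bootstrapped set to a single number, sort those numbers, and take the first 30% and the last 30% of the sorted list. The sketch below assumes exactly that and nothing more. The thread never says what "the data" is or which value it should be ranked by, so select_tails and the sample scores are hypothetical placeholders, not anything provided by ARACNE or by the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of ARACNE): return the lowest and the
# highest $percent of a list of numeric scores as two array references.
sub select_tails {
    my ($scores, $percent) = @_;
    $percent = 30 unless defined $percent;

    # Sort numerically, smallest first.
    my @sorted = sort { $a <=> $b } @$scores;

    # Size of each tail, rounded down to a whole number of items.
    my $k = int(scalar(@sorted) * $percent / 100);
    return ([], []) if $k == 0;

    my @lower = @sorted[0 .. $k - 1];
    my @upper = @sorted[$#sorted - $k + 1 .. $#sorted];
    return (\@lower, \@upper);
}

# Made-up scores standing in for values read from the bootstrap output.
my @scores = (0.12, 0.87, 0.45, 0.03, 0.66, 0.91, 0.30, 0.58, 0.74, 0.21);
my ($lower, $upper) = select_tails(\@scores, 30);

print "lower 30%: @$lower\n";
print "upper 30%: @$upper\n";
```

Rounding the tail size down is a choice made for this sketch; with few items you may prefer to round up, and ties at the cut-off are not treated specially here.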