comment on

I have a script that essentially extracts DNA sequences at specific locations and piles them up to calculate the frequency of each letter (A/G/C/T):

use strict;
use warnings;
use Bio::SeqIO;
use Getopt::Long;
use File::Basename;
use List::Util qw(all);
my $scriptname = basename($0);    #Obtain script-name
my $outext = '.txt';    #Output .file-extension
my ($peaks, $fasta, $width, $mode, $exclusion, %list, $chr, $min, $max
+, %sequences);
my $usage = "Usage: $scriptname -i <HistogramFile> -r <ReferenceFASTA>
+ -w <Width> -m <Mode: Pos/Freq> -e <ExclusionList (Optional)>";      
+  #Error/usage message
GetOptions('i=s' => \$peaks,
              'r=s' => \$fasta,
              'e=s' => \$exclusion,
              'w=i' => \$width,
              'm=s' => \$mode) or die("\n$usage\n");
die("\nError: Arguments or -flags are missing and/or incorrectly speci
+fied.\n\n$usage\n\n") unless all {defined} $peaks, $fasta, $width, $m
+ode;
my @files = glob($peaks."/*.txt");
my $chk = scalar(@files);
print "\nFailed to detect any valid files within the current directory
+.\n\n" if $chk == 0;
exit if $chk == 0;
if (defined $exclusion) {
    open my $IN, '<', $exclusion or die "$!";
    while (<$IN>) {
        chomp $_;
        ($chr, $min, $max) = split('\s+', $_);
        push @{$list{$chr}}, [$min,$max];
    }
}
my $seqio = Bio::SeqIO->new(-file => $fasta);
while(my $seqobj = $seqio->next_seq) {
    my $id  = $seqobj->display_id;
    my $seq = $seqobj->seq;
    $sequences{$id} = $seq;
}
for my $file (@files) {
    open my $IN2, '<', $file or die "$!";
    (my $sample = basename($file)) =~ s/.txt//;
    my $dir = dirname($file);
    my $outfile = $dir."/".$sample."_SeqBias_"."$mode".$outext;
    open my $OUT, '>', $outfile or die "$!";
    my (@flankseq, $seqEX);
    sub seqstore {
        my @rcd = @_;
        if ($mode eq "Pos") {
            push (@flankseq, $rcd[0]);
        } elsif ($mode eq "Freq") {
            push (@flankseq, $rcd[0]) for (1..$rcd[1]);
        }
        return;
    }
    <$IN2> for (1..1);
    while (<$IN2>) {
        chomp $_;
        my(@F) = split("\t", $_);
        next if grep {$F[1] >= $_->[0] && $F[1] <= $_->[1]} @{ $list{$
+F[0]}};
        next if $F[1]-$width < 1;
        if ($F[3] > 0) {
            $seqEX = substr($sequences{$F[0]},$F[1]-1-$width,$width) #
+Extracts Xbp upstream (not incl. 5' end)
            .substr($sequences{$F[0]},$F[1]-1,$width+1);
            $seqEX =~ tr/GATC/CTAG/;
            $seqEX = reverse($seqEX);
            seqstore($seqEX,$F[3]);
        }
        if ($F[2] > 0) {
            $seqEX = substr($sequences{$F[0]},$F[1]-1-$width,$width) #
+Extracts Xbp upstream (not incl. 5' end)
            .substr($sequences{$F[0]},$F[1]-1,$width+1);
            seqstore($seqEX,$F[2]);
        }
    }
    my %freq;
    my $inc = 1 / @flankseq;
    for my $FS (@flankseq) {
        $freq{substr $FS, $_, 1}[$_] += $inc for 0..length($FS)-1;
    }
    printf($OUT "%s\t"."%s\t" x ((keys %freq)-1)."%s"."\n", "Pos",sort
+ keys %freq);
    my $label = -$width-1;
    for my $pos (1..length $flankseq[0]) {
        $label++;
        printf($OUT "%d\t"."%7.4f\t" x ((keys %freq)-1)."%7.4f"."\n", 
+$label,map{$freq{$_}[$pos-1] // 0} sort keys %freq);
    }
}
my $run_time = time() - $^T;
print "\n-------------------------------------";
print "\nAnalysis Complete\n";
print "Execution Time: $run_time Seconds\n";
print "-------------------------------------\n\n";
[download]

I've recently tried to add batch processing into the script so it can handle more than 1 input file at a time. However, i've run into a problem that I cannot figure out:

Illegal division by zero at line 80, <$IN2>

Line 80 is:

my $inc = 1 / @flankseq;

When handling the first file, @flankseq is populated as expected (I push the extracted sequences into an array).

However when it comes to opening the second file - which is an identical copy of the first - and repeating the process, @flankseq remains empty leading to the error. Everything else seems to work up until that point and all variables such as $seqEX look as expected.

I have no idea why! Any suggestions?

In reply to Array ending up empty when processing more than a single file by TJCooper

Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!

Titles consisting of a single word are discouraged, and in most cases are disallowed outright.

Read Where should I post X? if you're not absolutely sure you're posting in the right place.

Please read these before you post! —

Posts may use any of the Perl Monks Approved HTML tags:

a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr

You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)

	For:		Use:
	&		`&`
	<		`<`
	>		`>`
	[		`[`
	]		`]`

Link using PerlMonks shortcuts! What shortcuts can I use for linking?

See Writeup Formatting Tips and other pages linked from there for more info.