Bioinformatics- Use of uninitialized value

mlsmit10 has asked for the wisdom of the Perl Monks concerning the following question:

Hi, guys. I've posted previously about this particular script, but I've recently run into a new problem with it. When running the script, I receive the following error: Use of uninitialized value $_ in pattern match (m//) at /Users/smithcabinets26/Desktop/RAD/Digester/Improving/Triple.pl line 66.

This is line 66:

my @third_fragments = grep !/$rsite3/, $second_fragments[$i];

The script takes a genome, cuts it at two restriction sites, excludes fragments containing a third site, and tells me how many fragments are present in particular size ranges. Line 66 is meant to remove the fragments with the third site present. The script seems to be functioning correctly, but I am not sure why the above warning occurs, and I am worried that it may be affecting my output in some way. Thanks in advance for your help. The entire script is below.

#! /usr/bin/perl -w
# a script to get fragments of a genome based on restriction enzyme
# retrieves the fragments greater than XX bp and less than XX bp
# whole genome in one gzip file (fasta format)!
# calculates the number of fragments you will get per genome for RADseq
# perl genomeFragmentor_doubleDigest.pl genome.gz restriction_site1 restriction_site2 organism
# perl genomeFragmentor_doubleDigest.pl Anolisgenome.gz CCTGCAGG GGATCC Anolis

use strict;

my %genome = fasta_read_gzip_alt($ARGV[0]);
#my %genome = fasta_read_alt($ARGV[0]);
my $rsite1 = $ARGV[1];
my $rsite2 = $ARGV[2];
my $rsite3 = $ARGV[3];

my $totalCount = 0;            # count of all fragments in the genome
my $totalBP = 0;            # the total number of base pairs in the selected fragments

my $organism = $ARGV[4];

my $radtagfile = $organism."_Radseq_fragments_doubleDigest_".$rsite1."_".$rsite2."_".$rsite3.".fasta";
my $sizesfile = $organism."_Radseq_fragments_doubleDigest_".$rsite1."_".$rsite2."_".$rsite3."_sizes.txt";
open (OUT, ">$radtagfile");
open (SIZE, ">$sizesfile");
print SIZE "length    organism\n";

# prepare a summary file
my ($Second, $Minute, $Hour, $Day, $Month, $Year, $WeekDay, $DayOfYear, $IsDST) = localtime(time);
$Month++;
$Year += 1900;
my $date = $Month."_".$Day."_".$Year;
my $summaryfile = $organism."_summary_doubleDigest_".$rsite1."_".$rsite2."_".$rsite3."_".$date.".txt";
open (RESULTS, ">$summaryfile");

my %final_fragments;
my @all_size_fragments;

# start creating the fragments
foreach my $seqname (keys %genome) {
    print "Working on sequence $seqname.\n";
    
    my @first_fragments = split("$rsite1", $genome{$seqname});    # split the fragments up based on the enzyme motif
    
    # add the restriction site motif back onto the fragments
    $first_fragments[0] .= $rsite1;
    $first_fragments[scalar @first_fragments - 1] = $rsite1.$first_fragments[scalar @first_fragments - 1];
    
    for (my $i = 0; $i < scalar @first_fragments; ++$i) {
        if ($i != 0 or $i != (scalar @first_fragments - 1)) {
            $first_fragments[$i] = $rsite1.$first_fragments[$i].$rsite1;
        }
        
        # now split the fragment using $rsite2
        # repair the first and last fragments to include $rsite2
        # these are the only fragments to contain both restriction sites, so keep them in @final_fragments
        my @second_fragments = split($rsite2, $first_fragments[$i]);
        
        $second_fragments[0] .= $rsite2;
        $second_fragments[scalar @second_fragments - 1] = $rsite2.$second_fragments[scalar @second_fragments - 1];
        foreach my $fragment (@second_fragments) {
            push(@all_size_fragments, length($fragment));
            
            }
            
        my @third_fragments = grep !/$rsite3/, $second_fragments[$i];
        
        
        $third_fragments[0] .= $rsite3;
        $third_fragments[scalar @third_fragments - 1] = $rsite3.$third_fragments[scalar @third_fragments - 1];
        foreach my $fragment (@third_fragments) {
            push(@all_size_fragments, length($fragment));
            }
        my $final_fragment1 = $seqname."_".$i."_1";
        my $final_fragment2 = $seqname."_".$i."_2";
        $final_fragments{$final_fragment1} = $third_fragments[0];
        $final_fragments{$final_fragment2} = $third_fragments[scalar @third_fragments - 1];
    }
    
}

# keep a score of how many fragments fall within a particular size range

my $size_100_150 = 0;
my $size_151_200 = 0;    
my $size_201_250 = 0;    
my $size_251_300 = 0;    
my $size_301_350 = 0;    
my $size_351_400 = 0;    
my $size_401_450 = 0;    
my $size_451_500 = 0;    
my $size_501_550 = 0;    
my $size_551_600 = 0;
my $size_small = 0;
my $size_large = 0;    

    
foreach my $fragment (keys %final_fragments) {        
    # add on $rsite1 to both sides of the fragment
    my $fragmentLength = length($final_fragments{$fragment});
    print OUT ">$fragment", "_", "1\n";
    print OUT substr($final_fragments{$fragment}, 0, 96), "\n";
    print OUT ">$fragment", "_", "2rc\n";
    print OUT revcom(substr($final_fragments{$fragment}, $fragmentLength - 96, 96)), "\n";
    $totalBP += $fragmentLength;
    
    if ($fragmentLength >= 100 and $fragmentLength <= 150) {
        ++$size_100_150;
    } elsif ($fragmentLength > 150 and $fragmentLength <= 200) {
        ++$size_151_200;
    } elsif ($fragmentLength > 200 and $fragmentLength <= 250) {
        ++$size_201_250;
    } elsif ($fragmentLength > 250 and $fragmentLength <= 300) {
        ++$size_251_300;
    } elsif ($fragmentLength > 300 and $fragmentLength <= 350) {
        ++$size_301_350;
    } elsif ($fragmentLength > 350 and $fragmentLength <= 400) {
        ++$size_351_400;
    } elsif ($fragmentLength > 400 and $fragmentLength <= 450) {
        ++$size_401_450;
    } elsif ($fragmentLength > 450 and $fragmentLength <= 500) {
        ++$size_451_500;
    } elsif ($fragmentLength > 500 and $fragmentLength <= 550) {
        ++$size_501_550;
    } elsif ($fragmentLength > 550 and $fragmentLength <= 600) {
        ++$size_551_600;
    } elsif ($fragmentLength < 100) {
        ++$size_small;
    } elsif ($fragmentLength > 600) {
        ++$size_large;
    }
}
    
$totalCount = scalar keys %final_fragments;            # count of all fragments in the genome
print RESULTS "The restriction sites used were:\n";
print RESULTS $ARGV[1], "\n";
print RESULTS $ARGV[2], "\n\n";

print RESULTS "There were ", $totalCount, " fragments from the whole genome.\n";

print RESULTS "There were ", $size_100_150, " fragments between 100 and 150 bp.\n";
print RESULTS "There were ", $size_151_200, " fragments between 151 and 200 bp.\n";
print RESULTS "There were ", $size_201_250, " fragments between 201 and 250 bp.\n";
print RESULTS "There were ", $size_251_300, " fragments between 251 and 300 bp.\n";
print RESULTS "There were ", $size_301_350, " fragments between 301 and 350 bp.\n";
print RESULTS "There were ", $size_351_400, " fragments between 351 and 400 bp.\n";
print RESULTS "There were ", $size_401_450, " fragments between 401 and 450 bp.\n";
print RESULTS "There were ", $size_451_500, " fragments between 451 and 500 bp.\n";
print RESULTS "There were ", $size_501_550, " fragments between 501 and 550 bp.\n";
print RESULTS "There were ", $size_551_600, " fragments between 551 and 600 bp.\n";
print RESULTS "There were ", $size_small, " fragments smaller than 100 bp.\n";
print RESULTS "There were ", $size_large, " fragments larger than 600 bp.\n\n";

print RESULTS "There are ", $totalBP, " base pairs in the fragments.\n";

print RESULTS "\nJust some notes of reference:\n";
print RESULTS "HiSeq 2000 gives 375,000,000 reads.\n"; 
print RESULTS "HiSeq 2500 gives 742,000,000 reads.\n"; 

close OUT;
close RESULTS;

foreach my $length (@all_size_fragments) {
    print SIZE $length, "\t", $organism, "\n";
}
exit;

sub fasta_read_gzip_alt {
    
    # reads in a gzip fasta file and parses it into a hash
    # version 1.0
    
    (my $filename) = @_;        # be sure to include the path
    my %fasta;
    open(FASTA, "gunzip -c $filename |") || die "can't open pipe to $filename";
    my $fastaData;
    my $sequence = '';
    my $name = '';
    while(<FASTA>) {
        $fastaData = $_;
        $fastaData =~ s/\n//gms;
        if ($fastaData =~ />/) {
            if ($sequence) {                # if there is a sequence, then the sequence belongs to the last name
                $fasta{$name} = $sequence;
            }
            
            # reinitialize everything
            $sequence = '';                # start over!
            $name = $fastaData;
            $name =~ s/>//gms;
        } elsif (eof FASTA) {
            $fasta{$name} = $sequence;
        } else {
            $sequence .= $fastaData;
        }
    }
    close FASTA;
    
    return %fasta;

}

sub revcom {
    (my $sequence) = @_;
    $sequence = reverse($sequence);
    $sequence =~ tr/AGCTRYMKSWHBVDNagctrymkswhbvdn/TCGAYRKMSWDVBHNtcgayrkmswdvbhn/;
    return $sequence;
}

sub fasta_read_alt {
    
    # reads in a fasta file and parses it into a hash
    # version 1.0
    
    (my $filename) = @_;        # be sure to include the path
    my %fasta;
    open(FASTA, $filename);
    my $fastaData;
    my $sequence = '';
    my $name = '';
    while(<FASTA>) {
        $fastaData = $_;
        $fastaData =~ s/\n//gms;
        if ($fastaData =~ />/) {
            if ($sequence) {                # if there is a sequence, then the sequence belongs to the last name
                $fasta{$name} = $sequence;
            }
            
            # reinitialize everything
            $sequence = '';                # start over!
            $name = $fastaData;
            $name =~ s/>//gms;
        } elsif (eof FASTA) {
            $fasta{$name} = $sequence;
        } else {
            $sequence .= $fastaData;
        }
    }
    close FASTA;
    return %fasta;

}

Replies are listed 'Best First'.
Re: Bioinformatics- Use of uninitialized value by kennethk (Abbot) on Jul 21, 2014 at 21:41 UTC
I don't think you are doing what you intend to.

`my @third_fragments = grep !/$rsite3/, $second_fragments[$i];`

only checks the i'th element of the `@second_fragments` array, and `$i` is associated with `@first_fragments`. Perhaps you meant

`my @third_fragments = grep !/$rsite3/, @second_fragments;`

Also, you should be aware that the vast majority of your `scalar` invocations are unnecessary. `<`, `-` and most other operators enforce scalar context already. And you can count from the end of an array using negative indices, thus you could have written e.g.

`$first_fragments[-1] = $rsite1.$first_fragments[-1];`
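To make the difference concrete, here is a minimal sketch with made-up fragments. The values of `@second_fragments`, `$rsite3` and `$i` are assumptions for illustration only, not data from the original script. Warnings are collected through `$SIG{__WARN__}` so the effect of indexing past the end of the array can be inspected:

```perl
use strict;
use warnings;

# Toy data, invented for illustration only.
my $rsite3           = 'GGATCC';
my @second_fragments = ('AAAT', 'CCGGATCCTT', 'TTTA');

# Collect warnings in an array instead of letting them go to STDERR.
my @warnings;
$SIG{__WARN__} = sub { push @warnings, $_[0] };

# List form: every fragment is tested, those containing the site are dropped.
my @from_list = grep !/$rsite3/, @second_fragments;

# Single-element form: index 5 is past the end of the array, so grep is
# handed one undefined value and the pattern match warns about $_.
my $i = 5;
my @from_scalar = grep !/$rsite3/, $second_fragments[$i];

print "list form kept: @from_list\n";
print "single-element form kept ", scalar(@from_scalar), " item(s)\n";
print "warnings seen: ", scalar(@warnings), "\n";
print $warnings[0] if @warnings;
```

With these toy values the list form keeps the two fragments that lack the site, while the single-element form produces the "Use of uninitialized value $_ in pattern match" warning from the question.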
Re^2: Bioinformatics- Use of uninitialized value by mlsmit10 (Initiate) on Jul 21, 2014 at 21:51 UTC
I tried making the change you suggested, and the results don't make any sense. It gives me a number that is higher than the number I get if I run the same script without the grep function (and associated modifications) included. Should I define something similar to the $i defined for the first fragments for the second fragments? Thanks.
Re: Bioinformatics- Use of uninitialized value by johngg (Canon) on Jul 21, 2014 at 22:45 UTC
I'm not a Bio guy either so just general thoughts from me as well. Consider using `strftime()` from the core POSIX module to construct your date and perhaps use `join` rather than concatenation to form filenames.

    use strict;
    use warnings;

    use POSIX qw{ strftime };

    my $organism = q{WildHaggis};
    my $rsite1 = q{ONE};
    my $rsite2 = q{TWO};
    my $rsite3 = q{THREE};

    my $summaryfile = join q{_},
        $organism,
        q{summary_doubleDigest},
        $rsite1,
        $rsite2,
        $rsite3,
        strftime q{%m_%d_%Y.txt}, localtime( time() );

    print $summaryfile, qq{\n};

This prints

    WildHaggis_summary_doubleDigest_ONE_TWO_THREE_07_21_2014.txt

It is considered best practice to use lexical filehandles and the three-argument form of open and to test for success, showing the o/s error on failure, e.g.

    open my $outFH, q{>}, $radtagfile or die qq{open: > $radtagfile: $!\n};

Consider using Perl-style loops rather than C-style ones. Instead of

    for (my $i = 0; $i < scalar @first_fragments; ++$i)

you can write

    foreach my $i ( 0 .. $#first_fragments )

I hope these pointers are helpful.

Cheers,

JohnGG
Re: Bioinformatics- Use of uninitialized value by GrandFather (Saint) on Jul 22, 2014 at 00:03 UTC
Whenever you find yourself copying and pasting code, or generating a bunch of lines that look similar, either use a sub or consider using an appropriate data structure. The following rework of your code checks opens, uses the three parameter version of open, uses lexical file handles, "fixes" the grep of a single element issue, cleans up the binning code (do you really want unequal bin sizes?) and adds some test data along with a file reading variant that uses it.

    use strict;
    use warnings;

    my ($fastaFile, $rsite1, $rsite2, $rsite3, $organism) = @ARGV;
    my %genome = fastaReadData();
    my $radtagfile = join '_', $organism, "Radseq_fragments_doubleDigest", $rsite1,
        $rsite2, "${rsite3}fasta";
    my $sizesfile = join '_', $organism, "Radseq_fragments_doubleDigest", $rsite1,
        $rsite2, $rsite3, "sizes.txt";

    open my $fOut,  '>', $radtagfile or die "Can't create $radtagfile: $!\n";
    open my $fSize, '>', $sizesfile  or die "Can't create $sizesfile: $!\n";
    print $fSize "length\torganism\n";

    # prepare a summary file
    my ($Second, $Minute, $Hour, $Day, $Month, $Year) = localtime(time);

    $Month++;
    $Year += 1900;

    my $date = join '_', $Month, $Day, $Year;
    my $summaryfile = join '_', $organism, "summary_doubleDigest", $rsite1, $rsite2,
        $rsite3, "${date}txt";

    open my $fRes, '>', $summaryfile or die "Can't create $summaryfile: $!\n";

    my %final_fragments;
    my @all_size_fragments;

    # start creating the fragments
    foreach my $seqname (keys %genome) {
        print "Working on sequence $seqname.\n";

        # split the fragments up based on the enzyme motif
        my @first_fragments = split $rsite1, $genome{$seqname};

        # add the restriction site motif back onto the fragments
        $first_fragments[0] .= $rsite1;
        $first_fragments[-1] = "$rsite1$first_fragments[-1]";

        for my $i (0 .. $#first_fragments) {
            if ($i or $i != $#first_fragments) {
                $first_fragments[$i] = "$rsite1$first_fragments[$i]$rsite1";
            }

            # now split the fragment using $rsite2
            # repair the first and last fragments to include $rsite2
            # these are the only fragments to contain both restriction sites, so
            # keep them in @final_fragments
            my @second_fragments = split $rsite2, $first_fragments[$i];

            $second_fragments[0] .= $rsite2;
            $second_fragments[-1] = "$rsite2$second_fragments[-1]";
            push @all_size_fragments, length $_ for @second_fragments;

            my @third_fragments = grep !/$rsite3/, @second_fragments;

            $third_fragments[0] .= $rsite3;
            $third_fragments[-1] = "$rsite3$third_fragments[-1]";
            push @all_size_fragments, length $_ for @third_fragments;
            $final_fragments{join '_', $seqname, $i, 1} = $third_fragments[0];
            $final_fragments{join '_', $seqname, $i, 2} = $third_fragments[-1];
        }
    }

    # keep a score of how many fragments fall within a particular size range
    my %counts;
    my @bins = (('smaller than 100') x 2, 'between 100 and 150');

    push @bins, map {'between ' . ($_ * 50 + 1) . ' and ' . ($_ + 1) * 50} 3 .. 12;
    push @bins, 'larger than 600';
    $counts{$_} = 0 for @bins;

    my $totalBP = 0;    # the total number of base pairs in the selected fragments

    foreach my $fragment (keys %final_fragments) {
        # add on $rsite1 to both sides of the fragment
        my $fragmentLength = length($final_fragments{$fragment});

        print $fOut ">${fragment}_1\n";
        print $fOut substr($final_fragments{$fragment}, 0, 96), "\n";
        print $fOut ">${fragment}_2rc\n";
        print $fOut revcom(
            substr($final_fragments{$fragment}, $fragmentLength - 96, 96)
            ), "\n";
        $totalBP += $fragmentLength;

        my $binnedSize = $fragmentLength / 50;

        $binnedSize = 2 if $binnedSize < 2 && $fragmentLength >= 100;

        if ($binnedSize >= $#bins) {
            ++$counts{$bins[-1]};
        } else {
            ++$counts{$bins[$binnedSize]};
        }
    }

    my $totalCount = keys %final_fragments;

    print $fRes "The restriction sites used were:\n";
    print $fRes "$rsite1\n";
    print $fRes "$rsite2\n\n";
    print $fRes "There were ", $totalCount,
        " fragments from the whole genome.\n";
    print $fRes "There were $counts{$_} fragments $_ bp.\n" for @bins[1 .. $#bins];
    print $fRes "There are ", $totalBP, " base pairs in the fragments.\n";
    print $fRes "\nJust some notes of reference:\n";
    print $fRes "HiSeq 2000 gives 375,000,000 reads.\n";
    print $fRes "HiSeq 2500 gives 742,000,000 reads.\n";

    close $fOut;
    close $fRes;

    print $fSize "$_\t$organism\n" for @all_size_fragments;

    sub fastaReadData {
        return parseFile(*DATA);
    }

    sub fasta_read_gzip_alt {
        my ($filename) = @_;    # be sure to include the path
        my %fasta;

        open my $fIn, "gunzip -c $filename \|"
            or die "can't open pipe to $filename: $!";
        return parseFile($fIn);
    }

    sub fasta_read_alt {
        my ($filename) = @_;    # be sure to include the path

        open my $fIn, '<', $filename or die "can't open $filename: $!\n";
        return parseFile($fIn);
    }

    sub parseFile {
        my ($fIn) = @_;
        my $fastaData;
        my $sequence = '';
        my $name = '';
        my %fasta;

        while (<$fIn>) {
            $fastaData = $_;
            $fastaData =~ s/\n//gms;
            if ($fastaData =~ />/) {
                if ($sequence) {
                    # if there is a sequence, then the sequence belongs to the last
                    # name
                    $fasta{$name} = $sequence;
                }

                # reinitialize everything
                $sequence = '';    # start over!
                $name = $fastaData;
                $name =~ s/>//gms;
            } elsif (eof $fIn) {
                $fasta{$name} = $sequence;
            } else {
                $sequence .= $fastaData;
            }
        }

        return %fasta;
    }

    sub revcom {
        (my $sequence) = @_;
        $sequence = reverse($sequence);
        $sequence =~
            tr/AGCTRYMKSWHBVDNagctrymkswhbvdn/TCGAYRKMSWDVBHNtcgayrkmswdvbhn/;
        return $sequence;
    }

    __DATA__
    >24.6jsd1.Tut
    TTGGAGAGTTTGATCCTGGCTCAGGATGAACGCTGGCGGCGTGCCTAATA
    CATGCAAGTCGAGCGAATGGATTAAGAGCTTGCTCTTATGAAGTTAGCGG
    CGGACGGGTGAGTAACACGTGGGTAACCTGCCCATAAGACTGGGATAACT
    CCGGGAAACCGGGGCTAATACCGGATAACATTTTGAACCGCATGGTTCGA
    AATTGAAAGGCGGCTTCGGTCGTCACTTATGGATGGACCCGCGTCGCATT
    AGCTAGTTGGTGAGGTAACGGCTCACCAAGGCAACGATGCGTAGCCGACC
    TGAGAGGGTGATCGGCCACACTGGGACTGAGACACGGCCCAGACTCCTAC
    GGGAGGCAGCAGTAGGGAATCTTCCGCAATGGACGAAAGTCTGACGGAGC
    AACGCCGCGTGAGTGATGAAGGCTTTCGGGTCGTAAAACTCTGTTGTTAG
    GGAAGAACAAGTGCTAGTTGAATAAGCTGGCACCTTGACGGTACCTAACC
    AGAAAGCCACGGCTAACTACGTGCCAGCAGCCGCGGTAATACGTAGGTGG
    CAAGCGTTATCCGGAATTATTGGGCGTAAAGAACGCGCAGGTGGTTTCTT
    AAGTCTGATGTGAAAGCCCACGGCTCAACCGTGGAGGGTCATTGGAAACT
    GGGAGACTTGAGTGCAGAAGAGGAAAGTGGAATTCCATGTGTAGCGGTGA
    AATGCGTAGAGATATGGAGGAACACCAGTGGCCCAGGCGACTTTCTGGTC
    TGTAACTGACACTGAGGCGCGAAAGCGTGGGGAGCAAACAGGATTAGATA
    CCCTGGTAGTCCACGCCGTAAACGATGAGTGCTAAGTGTTAGAGGGTTTC
    CGCCCTTTAGTGCTGAAGTTAAAGCATTAAGCACTCCGCGTGTGGAGTAC
    GGCCGCAAGGCTGAAACTCAAAGGAATTGACGGGGGCCCGCACAAGCGGT
    GGAGCATGTGGTTTAATTCGAAGCAACGCGAAGAACCTTACCAGGTCTTG
    ACATCCTCTGACAACCCTAGAGATAGGGCTTCTCCTTCGGGAGCAGAGTG
    ACAGGTGGTGCATGGTTGTCGTCAGCTCGTGTCGTGAGATGTTGGGTTAA
    GTCCCGCAACGAGCGCAACCCTTGATCTTAGTTGCCATCATTWAGTTGGG
    CACTCTAAGGTGACTGCCGGTGACAAACCGGAGGAAGGTGGGGATGACGT
    CAAATCATCATGCCCCTTATGACCTGGGCTACACACGTGCTACAATGGAC
    GGTACAAAGAGCTGCAAGACCGCGAGGTGGAGCTAATCTCATAAAACCGT
    TCTCAGTTCGGATTGTAGGCTGCAACTCGCCTACATGAAGCTGGAATCGC
    TAGTAATCGCGGATCAGCATGCCGCGGTGAATACGTTCCCGGGCCTTGTA
    CACACCGCCCGTCACACCACGAGAGTTTGTAACACCCGAAGTCGGTGGGG
    TAACCTTTTTGGAGCCAGCCGCCTAAGGTGGGACAGATGATTGGGGTGAA
    GTCGTAACAAGGTAGCCGTATCGGAAGGTGCGGCTGGATCACCTCCTTTC
    T

The summary file contains:

    The restriction sites used were:
    TTGG
    CTAG

    There were 18 fragments from the whole genome.
    There were 14 fragments smaller than 100 bp.
    There were 3 fragments between 100 and 150 bp.
    There were 0 fragments between 151 and 200 bp.
    There were 1 fragments between 201 and 250 bp.
    There were 0 fragments between 251 and 300 bp.
    There were 0 fragments between 301 and 350 bp.
    There were 0 fragments between 351 and 400 bp.
    There were 0 fragments between 401 and 450 bp.
    There were 0 fragments between 451 and 500 bp.
    There were 0 fragments between 501 and 550 bp.
    There were 0 fragments between 551 and 600 bp.
    There were 0 fragments between 601 and 650 bp.
    There were 0 fragments larger than 600 bp.
    There are 1121 base pairs in the fragments.

    Just some notes of reference:
    HiSeq 2000 gives 375,000,000 reads.
    HiSeq 2500 gives 742,000,000 reads.

and no errors or warnings are raised.
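The summary above lists both a "between 601 and 650" line and a "larger than 600" line, so the bin edges may deserve a second look. As one possible alternative, here is a small sketch of a binning helper that reproduces the ranges of the original if/elsif chain. The name `bin_label` and the sample lengths are hypothetical, and the sketch assumes fragment lengths are non-negative integers:

```perl
use strict;
use warnings;

# Hypothetical helper: map a fragment length to the label of its size range,
# using the same ranges as the original if/elsif chain
# (<100, 100-150, 151-200, ..., 551-600, >600).
sub bin_label {
    my ($length) = @_;

    return 'smaller than 100'    if $length < 100;
    return 'larger than 600'     if $length > 600;
    return 'between 100 and 150' if $length <= 150;

    # Lengths 151..600 fall into 50 bp wide bins starting at 151.
    my $low = 151 + 50 * int( ($length - 151) / 50 );
    return sprintf 'between %d and %d', $low, $low + 49;
}

# Count a few made-up lengths per bin.
my %counts;
++$counts{ bin_label($_) } for 42, 100, 150, 151, 200, 201, 600, 601;

print "There were $counts{$_} fragments $_ bp.\n" for sort keys %counts;
```

Keeping the range logic in one sub means the boundary cases (100, 150, 151, 600, 601) can be checked directly, without running a whole genome through the script.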
Re^2: Bioinformatics- Use of uninitialized value by xyzzy (Pilgrim) on Jul 22, 2014 at 03:26 UTC
This is in the original script as well, but since this version is prettier to look at, I'll comment here.

    sub parseFile {
        my ($fIn) = @_;    # a filehandle
        my $fastaData;
        ...    # yada yada yada
        while (<$fIn>) {    # $/ has not been undefined, $fIn is read line by line
            $fastaData = $_;
            $fastaData =~ s/\n//gms;
            # /gms are all useless because:
            #   g - each time $fastaData will have exactly one \n (at the end)
            #   m - regex doesn't use ^ or $, and it's always a single line anyways
            #   s - regex doesn't use . and it's always a single line anyways
            # effectively performs the same function as
            #   chomp($fastaData);
            # or even
            #   chomp($fastaData = $_);
            ...
        }
        ...
    }

Has nothing to do with the error, but if we're on the subject of cleaning up the code, this is something that made me do a double-take.

On a less related note, GMS is also the name of a Dutch trance duo, whose three-hour live performance at a festival I attended was a life-changing experience, which is why that particular combination of letters always brings back the good vibes `:^)`
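A quick way to check the equivalence claimed above is the following sketch, which uses a made-up line rather than real FASTA input and assumes the default input record separator (`$/` is a newline):

```perl
use strict;
use warnings;

# One line as it would come back from <$fIn> (toy data).
my $line = "ACGTACGT\n";

# The substitution used in the script, applied to a copy of the line.
(my $via_regex = $line) =~ s/\n//gms;

# The chomp form suggested above, also applied to a copy.
chomp( my $via_chomp = $line );

print $via_regex eq $via_chomp ? "same result\n" : "different result\n";
```

Note that neither form removes a carriage return, so a file with Windows line endings read on a Unix system would still carry a trailing `\r` on each line.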
Re^3: Bioinformatics- Use of uninitialized value by GrandFather (Saint) on Jul 22, 2014 at 05:11 UTC
I was getting tired by the time I got to refactoring that code and didn't actually look at it critically at all :(. Good catch. There are plenty more, I'm sure!
Re: Bioinformatics- Use of uninitialized value by AnomalousMonk (Archbishop) on Jul 21, 2014 at 22:06 UTC
I'm not a bio-guy, so just some general thoughts.

    ... error: Use of uninitialized value $_ in pattern match (m//) at /Users/.../Triple.pl line 66.

This is not an error, but a warning. Warnings are enabled by use-ing the warnings pragma, which the script you posted does not do. I don't see how you could get the quoted warning from the posted code (update: unless you're using the `-w` command-line switch to invoke your script. Ah-ha, I see it now in the shebang invocation: strike this comment).

    c:\@Work\Perl\monks>perl -MData::Dump -le "use strict; ;; my $rx = 'x'; my @ra = grep !/$rx/, undef; dd \@ra; "
    [undef]

    c:\@Work\Perl\monks>perl -MData::Dump -le "use strict; use warnings; ;; my $rx = 'x'; my @ra = grep !/$rx/, undef; dd \@ra; "
    Use of uninitialized value $_ in pattern match (m//) at -e line 1.
    [undef]

`my @third_fragments = grep !/$rsite3/, $second_fragments[$i];`

`grep` operates on a list, but this statement has a list consisting only of the `$second_fragments[$i]` scalar, i.e., a single item. The `@third_fragments` array can then be initialized with only zero or one elements. I don't know if this was intended. Further, I don't offhand see any reason the `@second_fragments` array must contain `$i + 1` elements; thus, you may be accessing an element that's "off the end" of the array and therefore undefined.

Update: Finally saw `-w` in shebang invocation, so strike entire first 'thought'.
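The "off the end" point can be illustrated with a small sketch. The site and fragments below are invented for the example, and the loop only mirrors the shape of the posted script (an outer index `$i` over `@first_fragments`, with a fresh `@second_fragments` built on each pass):

```perl
use strict;
use warnings;

# Toy values, invented for illustration; not taken from the original data.
my $rsite2          = 'CTAG';
my @first_fragments = ('AACTAGGG', 'TTTT', 'CCCC');

my @report;
for my $i (0 .. $#first_fragments) {
    # Each first-round fragment is split on its own, so the number of
    # pieces is unrelated to the outer index $i.
    my @second_fragments = split /$rsite2/, $first_fragments[$i];

    my $state = defined $second_fragments[$i] ? 'defined' : 'undefined';
    push @report, sprintf 'i=%d pieces=%d element[%d] is %s',
        $i, scalar @second_fragments, $i, $state;
}

print "$_\n" for @report;
```

Here only the first pass finds a defined `$second_fragments[$i]`; on later passes the index is larger than the number of pieces, which is the situation in which the uninitialized-value warning would appear.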