in reply to Use of uninitialized value in string eq

Last time when I had an error of uninitialized value, I was able to fix it by simply initializing the variable.

Don't do that! Or at least, don't do that unless you understand why the variable was uninitialised and providing a default is the correct solution. Just initialising a variable to mask a warning is not fixing the bug in your code.
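To illustrate the difference, here is a minimal sketch. The %colour_of hash and both lookup subs are hypothetical and not taken from the original code. Defaulting to an empty string silences the warning but hides the fact that the key was never stored, whereas checking for undef reports the real problem where it occurs.

```perl
use strict;
use warnings;

# Hypothetical data: 'pear' is deliberately missing.
my %colour_of = (apple => 'red', banana => 'yellow');

# Masks the problem: a missing key silently becomes ''.
sub lookup_masked {
    my ($fruit) = @_;
    my $colour = $colour_of{$fruit};
    $colour = '' unless defined $colour;
    return $colour;
}

# Surfaces the problem: a missing key is reported where it happens.
sub lookup_checked {
    my ($fruit) = @_;
    my $colour = $colour_of{$fruit};
    die "No colour recorded for '$fruit'\n" unless defined $colour;
    return $colour;
}

print "masked:  '", lookup_masked('pear'), "'\n";
my $result = eval { lookup_checked('pear') };
print defined $result ? "checked: $result\n" : "checked died: $@";
```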

In a similar way, declaring all your variables up front in a block just to satisfy strict negates the virtue of using strict. In your sample code for example you use $line in a loop, but declare it outside the loop - that's bad. You declare $data then initialise it in the following line. Don't! Declare it where you initialise it.
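A minimal sketch of declaring each variable at the point of first use. The sample input and the name count_words are made up for illustration; only the scoping pattern matters.

```perl
use strict;
use warnings;

# Count whitespace-separated words on each line of a multi-line string.
sub count_words {
    my ($text) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!\n";

    my @counts;                         # needed after the loop, so declared outside it
    while (my $line = <$fh>) {          # $line exists only inside the loop
        chomp $line;
        my @words = split ' ', $line;   # declared where it is initialised
        push @counts, scalar @words;
    }
    close $fh;
    return @counts;
}

print join(',', count_words("one two\nthree\n")), "\n";
```

With this layout the scope of each variable tells the reader how long its value is meant to live.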

$gi highlights the problem. It is declared up front with everything else, which strips any meaning that could be inferred from its scope. The way it is used implies that $gi retains state across iterations of the while loop, but no check is made in the loop to see that $gi has a valid value in the $info{$gi} = $line assignment. An undef check before the assignment, with a die indicating a badly formatted file, may save a lot of grief at some point. BTW, should that assignment perhaps be a push: push @{$info{$gi}}, $line and should the array assignments later then be @temp = @{$info{$humangi}} and @value = @{$info{$gi}}?
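The check and the push suggested above could look something like this. It is only a sketch: the header pattern is borrowed from the regex used in the code under discussion, the sample records are invented, and build_info is a hypothetical helper name.

```perl
use strict;
use warnings;

# Build a hash of arrays: GI => [sequence lines]. Header lines are
# assumed to look like "gi|12345|..." (an assumption about the file format).
sub build_info {
    my (@lines) = @_;
    my %info;
    my $gi;    # must survive across iterations, so declared outside the loop

    for my $line (@lines) {
        if ($line =~ m/^\S+\|(\d+)/) {
            $gi = $1;    # a header line starts a new record
            next;
        }
        die "Badly formatted file: sequence line before any header\n"
            unless defined $gi;
        push @{ $info{$gi} }, $line;    # autovivifies the array on first use
    }
    return %info;
}

my %info = build_info('gi|111|ref', 'MKV', 'LLA', 'gi|222|ref', 'MTT');
print "$_: @{ $info{$_} }\n" for sort keys %info;
```

Here $gi is the one variable that really does need the wider scope, and the die makes a missing header fail loudly instead of producing an "uninitialized value" warning later.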

I notice too that you don't chomp lines in the INFILE2 loop. Is that by design?

True laziness is hard work

Re^2: Use of uninitialized value in string eq
by sophix (Sexton) on Apr 23, 2010 at 13:02 UTC
    Thank you, GrandFather. I particularly appreciate your advice regarding the initialization of variables. Plus, I had not heard of hash slices before, so that was very helpful too!

    Regarding chomp in the INFILE2 loop: you are right, I should have chomped the lines, because I generate that file with another script that adds \n at the end of each line. Thanks for pointing it out.

    I revised the code in line with your suggestions, and get the following errors:

    Possible unintended interpolation of @VariantList in string at PD.pl line 91.

    I have no idea why I am getting this error. I thought I can safely use an array to print it out in this way.

    Type of arg 1 to push must be array (not hash slice) at PD.pl line 33, near "$line;"

    This is weird?!

    Global symbol "@VariantList" requires explicit package name at PD.pl line 91.
    Execution of PD.pl aborted due to compilation errors.

    And here is the code.

        #!/usr/bin/perl -w
        use strict;
        use Data::Dumper;

        my %info = ();
        my $humangi;
        my $data = '/DATA/proteinfile.txt';
        open INFILE, '<', $data or die "Failed at opening $data!\n";

        # Construct the hash with GIs as keys and sequences as values
        while ( <INFILE> ) {
            my $line = $_;
            chomp($line);
            last if m!END!;
            if($line=~m/HUMAN/){
                ($humangi) = ($line=~m/^\S+\|(\d+)/)
            }
            if($line=~m/^\S+\|(\d+)/) {
                if(defined($1)) {
                    my $gi=$1;
                }
            }
            else {
                if (defined(my $gi)) {
                    push (@info{$gi}, $line);
                }
                else {
                    die "Badly formatted file. Failed at reading the GI!\n";
                }
            }
        }
        #print Dumper (\%info);
        print "$humangi\n";
        close(INFILE);

        my $data2 = '/DATA/variantlist.txt';
        open INFILE2, '<', $data2 or die "Failed at opening $data2!\n";
        my $data3 = '/DATA/VariantOutput.txt';
        open OUTFILE, '>', $data3 or die "Failed at opening $data3!\n";

        while ( <INFILE2>){
            # Grab a variant from the file (in this example: P82L)
            my $line2 = $_;
            chomp($line2);
            my $Variant = $line2;
            # Split the variant into three parts
            my ($source, $position, $sink) = split(/(\d+)(\w)/, $Variant);
            print "$source , $ position , $sink\n";
            # Check whether HS has the source (i.e., P) at the given position (i.e., 82)
            my @temp = @info{$humangi};
            if ( $temp[$position] eq $source) {
                print "Yep, $source has been confirmed!\n";
            }
            else {
                print "There is something wrong!\n";
            }
            # Scan the rest of the sequences to check what amino acid they have at the given position
            for my $gi ( keys %info ) {
                my @value = @info{$gi};
                my @VariantList = ();
                push ( @VariantList, $value[$position]);
                if ($value[$position] eq $sink){ # Note the cases where we observe the sink (i.e., L) at this position
                    print OUTFILE "A pathogenic deviation has been found at site $position - from $source to $sink !\n" . " And the corresponding gi for this deviation is: $gi\n";
                }
            }
            print OUTFILE "Variant list contains: @VariantList\n";
        }
        close(INFILE2);

      Oops, my bad. I should have used the hash slice correctly.

      Changing push (@info{$gi}, $line) to push (@{$info{$gi}}, $line) solved the problem.
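For anyone who hits the same message: @info{$gi} is a hash slice (a list of values taken from %info), whereas @{ $info{$gi} } dereferences the array reference stored under $gi, and push needs the latter. A small sketch with invented keys and values:

```perl
use strict;
use warnings;

my %info = (a => 'x', b => 'y');

# Hash slice: a list of the values stored under the listed keys.
my @slice = @info{ 'a', 'b' };

# Array dereference: treat the value under 'seqs' as an array reference.
# The array springs into existence on the first push.
push @{ $info{seqs} }, 'MKV';
push @{ $info{seqs} }, 'LLA';
my @lines = @{ $info{seqs} };

print "slice: @slice\n";
print "lines: @lines\n";
```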

      So I am left with the error involving @VariantList. Probably I am making another fundamental mistake when assigning, e.g., trying to assign an array into a scalar. I am looking into it.
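Both @VariantList messages most likely share one cause: the array is declared with my inside the for loop, so it is no longer in scope at the print that follows the loop. A reduced sketch that declares the array before the loop instead (collect_residues and the sample sequences are hypothetical):

```perl
use strict;
use warnings;

# Collect the residue found at a 1-based position in every sequence.
sub collect_residues {
    my ($position, %seq_for) = @_;

    my @variant_list;    # declared before the loop so it is still in scope after it
    for my $gi (sort keys %seq_for) {
        push @variant_list, substr($seq_for{$gi}, $position - 1, 1);
    }
    return @variant_list;    # an array declared inside the loop would be gone here
}

my @found = collect_residues(2, 111 => 'MPK', 222 => 'MLK');
print "Variant list contains: @found\n";
```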

        Victory! I made the code work properly. Nevertheless, I probably still have efficiency problems. I would appreciate it if you could comment on points where I can improve my code (in terms of both performance and appropriate style). Thanks guys, I could not have done it without your help!

        Here is the final ugly code.

            #!/usr/bin/perl -w
            use strict;
            use Data::Dumper;

            my %info = ();
            my ($gi, $humangi, $accession);
            my $data = '/DATA/proteinfile.txt';
            open INFILE, '<', $data or die "Failed at opening $data!\n";

            # Construct the hash with GIs as keys and sequences as values
            while ( <INFILE> ) {
                my $line = $_;
                chomp($line);
                last if m!END!;
                if($line=~m/HUMAN/){
                    ($humangi) = ($line=~m/^\S+\|(\d+)/);
                    ($accession) = ($line=~m/^\S+\|\d+\|\w+\|(\S{6}?)/);
                }
                if($line=~m/^\S+\|(\d+)/) {
                    if(defined($1)) {
                        $gi=$1;
                    }
                }
                else {
                    $info{$gi} = $line;
                }
            }
            #print Dumper (\%info);
            print "$humangi\n";
            print "$accession\n";
            close(INFILE);

            my $data2 = '/DATA/variantlist.txt';
            open INFILE2, '<', $data2 or die "Failed at opening $data2!\n";
            my $data3 = '/DATA/VariantOutput.txt';
            open OUTFILE, '>', $data3 or die "Failed at opening $data3!\n";
            print OUTFILE "This is [GI: $humangi] and [Accession: $accession]\nVARIANT\t\tPOTENTIAL\t\tPD\n";

            while ( <INFILE2>){
                # Grab a variant from the file (in this example: P82L)
                my $line2 = $_;
                chomp($line2);
                my $Variant = $line2;
                # Split the variant into three parts
                my ($source, $position, $sink) = split(/(\d+)(\w)/, $Variant);
                #print "$source\t$position\t$sink\n";
                # Check whether HS has the source (i.e., P) at the given position (i.e., 82)
                my $temp = $info{$humangi};
                #print "Temp contains $temp" . "\n";
                my @char = split //, $temp;
                #print "Now \@char contains: @char";
                #print "Inside the temp: $char[0] and $char[1]\n";
                my $target = $char[$position-1];
                #print "This is the target: $target" . "\n";
                if ( $target eq $source) {
                    print "Yep!\n";
                }
                my @VariantList = ();
                my @PDList = ();
                # Scan the rest of the sequences to check what amino acid they have at the given position
                for my $gi ( keys %info ) {
                    my $value = $info{$gi};
                    my @char2 = split //, $value;
                    my $potential = $char2[$position-1];
                    push (@VariantList, $potential);
                    if ($potential eq $sink){ # Note the cases where we observe the sink (i.e., L) at this position
                        my $pd = "$potential" . "{" . "$gi" . "}";
                        push (@PDList, $pd);
                        #print "A pathogenic deviation has been found at site $position - from $source to $sink !\n" . " And the corresponding gi for this deviation is: $gi\n";
                    }
                }
                print OUTFILE "$Variant\t\t@VariantList\t\t@PDList\n";
            }
            close(INFILE2);