hghosh has asked for the wisdom of the Perl Monks concerning the following question:

Hello, I'm fixing a Perl script for the lab I work at. The script worked until our Linux cluster was updated in 2017 (just for background). I'm working through each section of the script. I'm a Perl novice, by the way! Everything before this block of code works. The block below should open a file, look up the $size of each chromosome in %chrsize, and iterate over the range 0..$size. I've identified this strange issue: it returns a completely empty file. If I get rid of the if statement (where $piRNA{$chr}{$index} is set to zero), the output file is still incorrect: $chr and $index are formatted correctly, but the last value is not printed.

    open OUT, ">", "$filename.counts";
    foreach $chr (keys %chrsize){
        $size = $chrsize{$chr};
        # chr, size, index work (print statements show reasonable values)
        foreach $index (0..$size) {
            $piRNA{$chr}{$index}=0 if (!exists $piRNA{$chr}{$index});
            print OUT "$chr\t$index\t$piRNA{$chr}{$index}\n";
        }
    }

If I use an if-elsif statement, it still won't work. The only way I've gotten viable results is to specify that the script print to OUT only if $piRNA{$chr}{$index} exists. Can't I simply delete everything under the second foreach statement except for the "print OUT" statement, and wouldn't Perl automatically set "non-existent" keys to zero/undefined?

    open OUT, ">", "$filename.counts";
    foreach $chr (keys %chrsize){
        $size = $chrsize{$chr};
        # chr, size, index work (print statements show reasonable values)
        foreach $index (0..$size) {
            if (exists $piRNA{$chr}{$index}) {
                print OUT "$chr\t$index\t$piRNA{$chr}{$index}\n";
            }
        }
    }

I'd really appreciate any help or guidance, because my brain is fried.

Replies are listed 'Best First'.
Re: If Condition causes totally blank Output File/Issue in Writing to Output
by haukex (Archbishop) on May 24, 2019 at 17:10 UTC

    Welcome to Perl and the Monastery, hghosh!

    it will return a completely empty file

    If you don't have a use autodie; at the top of the script, then you are not checking for errors in creating the file. The typical way to do this would be e.g. open my $fh, '>', $filename or die "$filename: $!";. Other than that, based on the code you've shown, the only other reason the first piece of code would not produce any output would be that the loops don't run because %chrsize has no entries or all its values are less than 0 - you can check this by using a tool like Data::Dumper to look at the hash (see also the Basic debugging checklist). Also, just to be safe: are you doing a close OUT; at the appropriate time?
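
    A minimal sketch of that error checking, assuming a temporary directory and a hypothetical helper named write_line (neither is part of the original script). It checks open, print, and close, and shows what a failed open looks like once it is no longer silent:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Temporary directory used only for this illustration.
my $dir = tempdir( CLEANUP => 1 );

# Hypothetical helper: open for writing, print one line, and close,
# checking every step for failure.
sub write_line {
    my ( $path, $line ) = @_;
    open my $fh, '>', $path or die "$path: $!";
    print {$fh} $line       or die "print to $path: $!";
    close $fh               or die "close $path: $!";   # close flushes the buffer
    return -s $path;        # size in bytes once the data is written
}

my $size = write_line( "$dir/sample.counts", "chr1\t0\t0\n" );
print "wrote $size bytes\n";

# A path inside a directory that does not exist makes open fail. Without
# the "or die", the script would carry on and write nothing.
my $ok    = eval { write_line( "$dir/no_such_dir/sample.counts", "x\n" ); 1 };
my $error = $ok ? '' : $@;
print "caught: $error";
```

    With use autodie; the explicit "or die" clauses become unnecessary for open and close, since those calls then throw on failure by themselves.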

    wouldn't Perl automatically set "non-existent" keys to zero/undefined?

    When attempting to access a hash key that does not exist, Perl will return undef. Therefore the two pieces of code are not equivalent: the first sets $piRNA{$chr}{$index} to 0 if there was previously no key $index, and then prints the line unconditionally, while the second piece of code only prints the output line if the key $index exists in the hash %{ $piRNA{$chr} }. In general:

    • exists: In my %foo = ( bar => 1 );, exists $foo{quz} is false, and after a delete $foo{bar}, exists $foo{bar} is also false. A key existing or not has nothing to do with its value - assigning any value to a hash entry will not delete that entry/key, only delete does that.
    • defined: In my %foo = ( bar => undef );, the key bar exists, but testing it with defined $foo{bar} will return false.
    • Truth and Falsehood: A hash entry can exist and be defined, but its value can still be logically false (i.e. when used in an if condition and other "boolean contexts") if its value is undef, 0, "0", or "" (Update: where the first of those, undef, means defined will be false, but it will be true for the rest).
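
    The three tests in the list above can be compared side by side. This is only an illustrative sketch: the hash contents and the describe helper are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical hash with one entry per case from the list above.
my %foo = (
    missing_value => undef,   # key exists, value is not defined
    zero          => 0,       # exists, defined, but false
    empty         => "",      # exists, defined, but false
    one           => 1,       # exists, defined, and true
);

# Classify a key of %foo by exists, defined, and truth.
sub describe {
    my ($key) = @_;
    return join ",",
        ( exists  $foo{$key} ? "exists"  : "no key" ),
        ( defined $foo{$key} ? "defined" : "undef" ),
        ( $foo{$key}         ? "true"    : "false" );
}

print "$_: ", describe($_), "\n" for qw(missing_value zero empty one quz);

# Reading a missing key returns undef and does not create the key.
my $had_quz = exists $foo{quz} ? 1 : 0;

# Assigning undef keeps the key; only delete removes it.
$foo{one} = undef;
my $one_after_undef = exists $foo{one} ? 1 : 0;
delete $foo{one};
my $one_after_delete = exists $foo{one} ? 1 : 0;

print "quz exists: $had_quz, one after undef: $one_after_undef, ",
      "one after delete: $one_after_delete\n";
```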

    If I get rid of the if statement ... It will return the $chr and $index formatted correctly, but will not print the last value.

    Sounds to me like $piRNA{$chr}{$index} might be undef or "", which you can see with the aforementioned Data::Dumper, and also, if it's undef you should be seeing warnings about uninitialized values (one of the reasons Use strict and warnings is a best practice). However, if I understand your description correctly, this still doesn't explain the empty output file.

    If you could use a tool like Data::Dumper or Data::Dump to get the contents of %chrsize, then you could use this to build a Short, Self-Contained, Correct Example - a short piece of code that we can run to reproduce the problem on our end.
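
    As a rough idea of what such an example could look like, here is a sketch that assumes tiny made-up values for %chrsize and %piRNA (a real one would paste in the Data::Dumper output from the failing run). It writes to an in-memory string instead of a file so the result is easy to inspect:

```perl
use strict;
use warnings;
use Data::Dumper;
$Data::Dumper::Sortkeys = 1;   # stable key order in the dump

# Hypothetical miniature data standing in for the real hashes.
my %chrsize = ( chrA => 2, chrB => 1 );
my %piRNA   = ( chrA => { 1 => 5 } );

print Dumper( \%chrsize, \%piRNA );

# In-memory filehandle: everything printed to $out ends up in $output.
my $output = '';
open my $out, '>', \$output or die "in-memory open: $!";

for my $chr ( sort keys %chrsize ) {
    for my $index ( 0 .. $chrsize{$chr} ) {
        # Same logic as the first snippet in the question.
        $piRNA{$chr}{$index} = 0 if !exists $piRNA{$chr}{$index};
        print {$out} "$chr\t$index\t$piRNA{$chr}{$index}\n";
    }
}
close $out or die "close: $!";

print $output;
```

    If this small version behaves correctly while the full script does not, the difference most likely lies in the real data or in code outside the posted block.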

    Update 2019-08-17: Updated the link to "Truth and Falsehood".

Re: If Condition causes totally blank Output File/Issue in Writing to Output
by bliako (Abbot) on May 25, 2019 at 10:28 UTC

    You have to distinguish between an empty file (size of 0 bytes) and a file full of blanks (tabs and newlines). The latter will have non-zero size.
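
    One way to tell the two cases apart from Perl, sketched with a hypothetical classify_file helper and throwaway files in a temporary directory (the names are assumptions for the example):

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: "empty" (0 bytes), "blank" (only whitespace),
# or "has data" (at least one non-whitespace character).
sub classify_file {
    my ($path) = @_;
    return 'empty' if -z $path;               # -z is true for a zero-size file
    open my $fh, '<', $path or die "$path: $!";
    while ( my $line = <$fh> ) {
        return 'has data' if $line =~ /\S/;   # any non-whitespace character
    }
    return 'blank';
}

# Made-up test files for the three cases.
my $dir      = tempdir( CLEANUP => 1 );
my %contents = (
    empty => '',
    blank => "\t\t\n\t\t\n",
    data  => "chr1\t0\t0\n",
);

my %result;
for my $name ( sort keys %contents ) {
    my $path = "$dir/$name.counts";
    open my $fh, '>', $path or die "$path: $!";
    print {$fh} $contents{$name};
    close $fh or die "close $path: $!";
    $result{$name} = classify_file($path);
    print "$name.counts: $result{$name}\n";
}
```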

    In your first code, the only way to get a zero-size file would be if %chrsize is empty, so that foreach $chr (keys %chrsize){ does not loop at all. If $chrsize{$chr} is not defined (i.e. $size=undef) or if it is zero (i.e. $size=0), the foreach $index (0..$size) { will still loop once with $index = 0.
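
    The behaviour of the range can be checked directly. A small sketch, using a hypothetical indices_for helper that returns the indices foreach (0..$size) would visit:

```perl
use strict;
use warnings;

# Hypothetical helper: the list of indices that foreach (0..$size) visits.
# Must be called in list context, where ".." is the range operator.
sub indices_for {
    my ($size) = @_;
    no warnings 'uninitialized';   # an undef upper bound is treated as 0
    return ( 0 .. $size );
}

for my $size ( undef, 0, 3, -1 ) {
    my @idx   = indices_for($size);
    my $label = defined $size ? $size : 'undef';
    print "size=$label: ", scalar(@idx), " iteration(s): @idx\n";
}
```

    Only a negative upper bound yields no iterations at all, which matches the earlier remark about values less than 0.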

    Your second code is much more likely to produce a zero-size file.

    BTW, foreach $index (0..3) loops for 0, 1, 2 AND 3

    You really need to use strict; use warnings;

      Hi! I've inherited this script and I've just modified it so use strict will work. Also, thank you for pointing out the distinction between a blank vs empty file. The output file is not empty, but completely blank. I will be back shortly after troubleshooting with both respondents' tips. Thank you once again!
Re: If Condition causes totally blank Output File/Issue in Writing to Output
by hghosh (Acolyte) on May 28, 2019 at 13:02 UTC

    Hi all, a few updates: when I used autodie, I got an error saying "main::OUT is used only once, possible error at line 47." I fixed it by using a scalar handle as shown below. Also, I am getting a blank file (18.7 MB), not an empty one. Despite fixing this error, I still get a blank output file.

        use warnings;
        my $OUT;
        open $OUT, ">", "$filename.counts" or die "$filename.counts: $!";
        foreach my $chr (keys %chrsize){
            my $size = $chrsize{$chr};
            # chr, size, index work (print statements show reasonable values)
            foreach my $index (0..$size) {
                #if (exists $piRNA{$chr}{$index}) {
                #    print $OUT "$chr\t$index\t$piRNA{$chr}{$index}\n";}
                $piRNA{$chr}{$index}=0 if (!exists $piRNA{$chr}{$index});
                #print "$piRNA{$chr}{$index}\n" if (exists $piRNA{$chr}{$index}); # test code I've put in
                print $OUT "$chr\t$index\t$piRNA{$chr}{$index}\n";
            }
        }

      If there seems nothing wrong with the code and the output is still bad, there must be something wrong with your data. Use Data::Dumper and print a dump of your data hash %piRNA. See if the data meets your expectations.
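
      A sketch of such a dump, with made-up contents for %piRNA. Setting $Data::Dumper::Useqq makes values that print as "nothing" (undef, an empty string, control characters) visible; whether any of these occur in the real data is not known from the thread:

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical data with values that look blank when printed: undef,
# an empty string, and a lone carriage return, plus one ordinary count.
my %piRNA = (
    chr1 => { 0 => undef, 1 => '', 2 => "\r", 3 => 12 },
);

local $Data::Dumper::Sortkeys = 1;   # stable key order
local $Data::Dumper::Useqq    = 1;   # show control characters as escapes
local $Data::Dumper::Indent   = 1;   # compact indentation

my $dump = Dumper( \%piRNA );
print $dump;
```

      For a hash with millions of entries it may be more practical to dump a single chromosome, e.g. Dumper($piRNA{$chr}), or only the first few keys.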


      holli

      You can lead your users to water, but alas, you cannot drown them.

      Try printing something known on opening the file and again before closing.

          open $OUT, ">", "$filename.counts" or die "$filename.counts: $!";
          print $OUT scalar localtime." open\n";
          #
          # code
          #
          print $OUT scalar localtime." close\n";
          close $OUT;

      Does more code run after the part you posted?

      poj