Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,
I have written the following code that works OK (i.e. prints the correct results):
use stricts;
use warnings;

my $total_entry_non_barrel='';
my $aligns_non_barrel='';
my @split_aligns_non_barrel=();
my $separate_hit_non_barrel='';
my @split_hit_non_barrel=();
my $hit_line_non_barrel='';
my $hit_name_non_barrel='';
my $protein_non_barrel='';
my $stat_line_non_barrel='';
my @split_stats_non_barrel=();
my $score_non_barrel='';
my $conditional_evalue_non_barrel='';
my $env_start_non_barrel='';
my $env_end_non_barrel='';
my $i=0;
my $k=0;

$/="//\n";
while(<>)
{
    $total_entry_non_barrel=$_;
    if($total_entry_non_barrel=~/^Query:\s+(.*?)\s\[L=(\d+)\]/m)
    {
        $protein_non_barrel=$1;
    }
    if ($total_entry_non_barrel!~/No hits detected/)
    {
        if($total_entry_non_barrel=~/Domain annotation for each model:(.*)Internal pipeline statistics summary:/s)
        {
            $aligns_non_barrel=$1;
            @split_aligns_non_barrel=split(">> ", $aligns_non_barrel);
            for($i=0; $i<=$#split_aligns_non_barrel; $i++)
            {
                $separate_hit_non_barrel=$split_aligns_non_barrel[$i];
                @split_hit_non_barrel=split(/\n/, $separate_hit_non_barrel);
                $hit_line_non_barrel=$split_hit_non_barrel[0];
                if($hit_line_non_barrel=~/^(.*?)\s+/)
                {
                    $hit_name_non_barrel=$1;
                }
                #keep score, c-Evalue, env-start, env-end for each hit
                for($k=3; $k<=$#split_hit_non_barrel; $k++)
                {
                    $stat_line_non_barrel=$split_hit_non_barrel[$k];
                    @split_stats_non_barrel=split(/\s+/, $stat_line_non_barrel);
                    $score_non_barrel = $split_stats_non_barrel[3];
                    $conditional_evalue_non_barrel = $split_stats_non_barrel[5];
                    $env_start_non_barrel = $split_stats_non_barrel[13];
                    $env_end_non_barrel = $split_stats_non_barrel[14];
                }
                if($hit_name_non_barrel=~/^PF\d+/) #it is a PFAM-A hit, keep it if score >=TC
                {
                    if($score_non_barrel>=$hash_pfamA_tc{$hit_name_non_barrel})
                    {
                        print $hit_name_non_barrel, "\t", $score_non_barrel, "\t", $conditional_evalue_non_barrel, "\t",$env_start_non_barrel , "\t", $env_end_non_barrel, "\n";
                    }
                }
                elsif($hit_name_non_barrel=~/^PB\d+/) #it is a PfamB hit, keep it if Evalue<=1e-6
                {
                    if($conditional_evalue_non_barrel<=0.000001)
                    {
                        print $hit_name_non_barrel, "\t", $score_non_barrel, "\t", $conditional_evalue_non_barrel, "\t",$env_start_non_barrel , "\t", $env_end_non_barrel, "\n";
                    }
                }
            }
        }
    }
    print "##\n";
}
$/="\n";

However, when I run it, I get the warning:
Use of uninitialized value $hit_line_non_barrel in pattern match (m//) at ttt.pl line 735

which is the line:
if($hit_line_non_barrel=~/^(.*?)\s+/)

in my program. What am I doing wrong?
Also, are there any tips you could give me, or commands I could rewrite, that would make my code run faster?

Replies are listed 'Best First'.
Re: Can you tell me why I get this error and if I can optimize my code somehow?
by SuicideJunkie (Vicar) on Jun 20, 2014 at 19:04 UTC

    Start by trimming your input data in half until you find a minimal set of input that causes the issue.

    If you don't see a bad entry in the input remaining, then sprinkle some print statements through your code, and see where and when that undef is appearing.
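
As a rough sketch of the "sprinkle some print statements" idea, the snippet below reports the record number ($.) whenever a ">> " block has no first line after being split on newlines. The find_empty_blocks helper and the sample data are made up for illustration; they are not the OP's code or input.

```perl
use strict;
use warnings;

# Hypothetical helper: returns one diagnostic string for each
# ">> "-separated block whose first line is undefined after splitting.
sub find_empty_blocks {
    my ($fh) = @_;
    my @report;
    local $/ = "//\n";                       # same record separator as the OP
    while (my $record = <$fh>) {
        my @blocks = split />> /, $record;
        for my $idx (0 .. $#blocks) {
            my @lines = split /\n/, $blocks[$idx];
            next if defined $lines[0];       # this block is fine
            push @report,
                sprintf "record %d, block %d has no first line", $., $idx;
        }
    }
    return @report;
}

# Made-up input: the second record has only newlines before its first ">> ".
my $sample = "header\n>> PF00001 x\n//\n\n\n>> PF00002 y\n//\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my @report = find_empty_blocks($fh);
close $fh;
print "$_\n" for @report;   # prints: record 2, block 0 has no first line
```

Once the offending record number is known, that single record can be cut out of the input and inspected on its own.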

    PS: If you rewrite

    for($i=0; $i<=$#split_aligns_non_barrel; $i++) { $separate_hit_non_barrel=$split_aligns_non_barrel[$i]; ...
    as
    for my $separate_hit_non_barrel (@split_aligns_non_barrel) { ...
    then that would simplify your code a fair bit and make it easier to read. Having fewer pseudo-global variables and less unnecessary worrying about indexes is good too.
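
A slightly fuller sketch of that rewrite, with every variable declared in the smallest scope that needs it. The names $aligns and @hit_names and the sample text are invented for this example; only the two split calls and the hit-name regex come from the original post.

```perl
use strict;
use warnings;

# Made-up stand-in for the text captured from one record.
my $aligns = "\n>> PF00001  desc\n line a\n>> PB000002  desc\n line b\n";

my @hit_names;
for my $hit (split />> /, $aligns) {         # no index variable needed
    my @lines = split /\n/, $hit;            # lexical to this iteration
    next unless @lines;                      # nothing before the first ">> "
    my ($hit_name) = $lines[0] =~ /^(.*?)\s+/;
    push @hit_names, $hit_name if defined $hit_name;
}
print join(",", @hit_names), "\n";           # prints: PF00001,PB000002
```

Because each variable is declared inside the loop body, a value left over from a previous iteration cannot leak into the next one.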

Re: Can you tell me why I get this error and if I can optimize my code somehow?
by toolic (Bishop) on Jun 20, 2014 at 18:57 UTC
    Is that really the code you are using?

    I get this compile error:

    Can't locate stricts.pm in @INC

    If I change stricts to strict, I get this compile error:

    Global symbol "%hash_pfamA_tc" requires explicit package name at
Re: Can you tell me why I get this error and if I can optimize my code somehow?
by hexcoder (Curate) on Jun 20, 2014 at 21:31 UTC
    Probably because you called split on a string that was empty or contained only newlines (\n) here:
    @split_hit_non_barrel=split(/\n/, $separate_hit_non_barrel);

    Since split consumes its separators and drops trailing empty fields, it would then return an empty list.
    The first entry of an empty array would give an undef value:

    $hit_line_non_barrel=$split_hit_non_barrel[0];

    Using an undef value for a match would give a warning then:

    $hit_line_non_barrel=~/^(.*?)\s+/
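
The split behaviour described above can be checked with a few lines of Perl. The input strings below are made up for the test; they are not taken from the OP's data.

```perl
use strict;
use warnings;

# Count the fields split returns for a few made-up inputs.
my %count;
for my $input ("", "\n", "\n\n\n", "a\n\n", "\na") {
    my @fields = split /\n/, $input;
    $count{$input} = scalar @fields;
}
# ""       -> 0 fields
# "\n"     -> 0 fields (every field is empty, so all are dropped)
# "\n\n\n" -> 0 fields
# "a\n\n"  -> 1 field  (trailing empty fields are dropped)
# "\na"    -> 2 fields (a leading empty field is kept)

# Element 0 of an empty array is undef, which is what triggers the warning.
my @empty = split /\n/, "\n\n";
my $first = $empty[0];
print defined $first ? "defined\n" : "undef\n";   # prints: undef
```
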
    This is just a guess; using the Perl debugger would be much better. You can even customize the debugger so that it stops right at the point where the warning occurs (see Re: Debugging a program and Re^3: Debugging a program), which lets you look at what has gone wrong while the whole context is still available. I use this regularly with my code.
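
The linked posts are not reproduced here, so the following is only one common way to get this effect, not necessarily what they describe: install a __WARN__ handler that sets $DB::single, which asks the debugger to break at the next statement when the script runs under perl -d. The @seen array and the sample input are assumptions added for illustration.

```perl
use strict;
use warnings;

my @seen;                                    # warnings recorded so far

$SIG{__WARN__} = sub {
    my ($message) = @_;
    push @seen, $message;
    {
        no warnings 'once';
        $DB::single = 1;   # under "perl -d": break at the next statement
    }
    warn $message;         # re-emit; the handler is disabled while it runs
};

my @lines = split /\n/, "\n";    # made-up input that yields an empty list
my $first = $lines[0];           # undef
print "matched\n" if $first =~ /^(.*?)\s+/;   # triggers the warning
```

Outside the debugger the assignment to $DB::single has no effect, so the handler can stay in the script. Inside the debugger, the T command should then show the call stack that led to the warning.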
      Thank you very much, hexcoder, for these very useful links on a debugging technique that I did not know until now and that can save a tremendous amount of debugging time. I will certainly be using it all the time from now on. I wish I could upvote your post more than once.
Re: Can you tell me why I get this error and if I can optimize my code somehow?
by Laurent_R (Canon) on Jun 20, 2014 at 20:00 UTC
    Start by trimming your input data in half until you find a minimal set of input that causes the issue.

    Yes, that's the right method when you don't know where the issue occurs (for example if you first slurped the file into a scalar or an array, then you get the line number of the last line of the file), but in this case, I am a bit surprised that the warning does not say on which line of the input file the issue occurred. To the OP: did you give the full warning?
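
For reference, here is a small sketch of what the full warning normally looks like. When a filehandle has been read from, Perl appends its name and record number to the message, and it says "chunk" instead of "line" when $/ is not a plain newline. The sample input and the @messages array are made up for this example.

```perl
use strict;
use warnings;

my @messages;
local $SIG{__WARN__} = sub { push @messages, $_[0] };   # capture warnings

my $sample = "first\n//\n\n";    # made-up input: the 2nd record is just "\n"
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
{
    local $/ = "//\n";           # record separator, as in the OP's script
    while (my $record = <$fh>) {
        my @lines = split /\n/, $record;
        # For the second record @lines is empty, so $lines[0] is undef.
        print "hit line found\n" if $lines[0] =~ /^(.*?)\s+/;
    }
}
close $fh;

print @messages;   # the captured warning mentions "chunk 2"
```

With the OP's while(<>) loop the suffix would be expected to read something like ", <> chunk N.", which would point directly at the offending record.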