mdunnbass has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

I am trying to have a look at the elements of arrays in a dynamically created HoHoA. When I try something like this:

my ($span, %matches) = @_;
my ($site, $fastaseq, $fasta, $sitekey) = '';
my ($setscounter,$lowerlimit,$upperlimit,$i,$set,$yes,$low,$hit) = 0;
my %sets = ();
FASTA: for $fastaseq (keys %matches) {
    $sitekey = '';
    SITEKEY: for $sitekey (sort {$a <=> $b } keys %{$matches{$fastaseq}}) {
        if (@{$matches{$fastaseq}{$sitekey}}) {
            SET: foreach $hit (@{$matches{$fastaseq}{$sitekey}}) {
                print "\$matches{$fastaseq}{$sitekey} is: $matches{$fastaseq}{$sitekey}\n";
                print "\$hit is: ",$hit,"\n";
                print "\$hit is: ",$matches{$fastaseq}{$sitekey}[$hit],"\n";
                if ($hit >= $lowerlimit && $hit <= $upperlimit) {
                    #some code
                }
            }
        }
    }
}
I get the following output:
$matches{>LG_XIV}{0} is: ARRAY(0x182bb14)
Use of uninitialized value in print at MyScript line 386.
$hit is:
$hit is: 6906413
Use of uninitialized value in numeric ge (>=) at MyScript line 388.

BTW, line 386 is the first print $hit line, and so on. I'm using use strict and use warnings, and I am scratching my head like mad trying to figure out what's wrong. I've been staring at this so long that I keep losing my train of thought.

If I replace the straight $hit calls in the greater than comparison with $matches{$fastaseq}{$sitekey}[$hit], I get no change.

Basically, I just want to know what each of the values in the $matches{$fastaseq}{$sitekey} array is, and then throw it into the comparison on that if (>= && <=) line. But, no dice.

Does anyone have any ideas, or can point out what I'm doing wrong?

Thanks,
Matt

Replies are listed 'Best First'.
Re: foreach in HoHoA problems
by bobf (Monsignor) on Oct 28, 2006 at 04:41 UTC

    It looks like one of the elements of the array is undefined.

    I'm also a little confused about what you're trying to print. For example:

    SET: foreach $hit (@{$matches{$fastaseq}{$sitekey}}) {
    The foreach is iterating over the elements of the array, and $hit is the value of each element, not the index. This is important later.

    print "\$hit is: ",$hit,"\n";
    This line prints the value of $hit. The following line, however,

    print "\$hit is: ",$matches{$fastaseq}{$sitekey}[$hit],"\n";
    uses that value as an index. I don't know how you populated the array, but unless the values of $hit really are index values, you're accessing possibly undefined array elements. In your example above, $hit = 6906413. Do you really have that many elements in the array?

    I suspect you may be having trouble with dereferencing, but I don't know enough about the data struct you have and what you're trying to get out to give you an example. Can you post a partial dump of %matches and an example of what you want to print?

    Update: If you want to iterate over the array using indices instead of the values, you can use something like

    foreach my $index ( 0 .. $#array )
    If you want to simply skip undefined elements in the array, you can use
    next if( not defined $hit );
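
To make the difference between values and indices concrete, here is a minimal sketch. The sample %matches data (including the deliberate undef element) and the helper names defined_values and indexed_values are made up for illustration; they are not from the original script.

```perl
use strict;
use warnings;

# Hypothetical sample data shaped like %matches:
# sequence name => site key => array of match positions.
my %matches = (
    '>LG_XIV' => { 0 => [ 6906413, 6906500, undef, 6907001 ] },
);

# Iterate over the VALUES: $hit is each element in turn, not a position.
sub defined_values {
    my ($aref) = @_;
    my @out;
    foreach my $hit ( @{$aref} ) {
        next if not defined $hit;    # skip undefined elements
        push @out, $hit;
    }
    return @out;
}

# Iterate over the INDICES: look each element up by its position.
sub indexed_values {
    my ($aref) = @_;
    my @out;
    foreach my $index ( 0 .. $#{$aref} ) {
        my $hit = $aref->[$index];
        next if not defined $hit;
        push @out, "$index:$hit";
    }
    return @out;
}

my @values  = defined_values( $matches{'>LG_XIV'}{0} );
my @indexed = indexed_values( $matches{'>LG_XIV'}{0} );

print join( ', ', @values ),  "\n";
print join( ', ', @indexed ), "\n";
```

In the first loop, using $hit as a subscript would ask for element number 6906413 of a four-element array, which is undefined.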

      Thanks for the help. All of the print statements included are solely for debugging purposes. Eventually, I will be removing them all.

      And you're right, I want to be iterating over the indices of the array, not the values. I certainly hope there aren't 7 million elements in the array! I think 70 is far more than I am expecting. As for your suggestion of:

      foreach my $index ( 0 .. $#array )

      Would I have to write that as:

      foreach my $index ( 0 .. $#matches{$fastaseq}{$sitekey} )?

      As for undefined elements, I don't expect I'll see any. In a previous subroutine, I am pushing the positions of matches to an m// into the arrays in question (that's what the 7 million corresponds to), so I expect all array elements will be numerical. Whether or not the array is empty, however, depends entirely on whether or not m// came back as true. That's why I threw the if (@) in there.

      Thanks.
      Matt
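
For readers following along, this is a rough sketch of what "pushing the positions of matches to an m//" into a HoHoA might look like. It is only a guess at the shape of the earlier subroutine, which was not posted: the name collect_positions, its arguments, and the sample sequence and patterns are all assumptions.

```perl
use strict;
use warnings;

# Hypothetical helper: for each named sequence and each pattern,
# record the start offset of every match in $matches{$name}{$site_index}.
sub collect_positions {
    my ( $seqs, $sites ) = @_;    # hashref of name => sequence, arrayref of qr// patterns
    my %matches;
    for my $name ( keys %{$seqs} ) {
        my $seq = $seqs->{$name};
        for my $i ( 0 .. $#{$sites} ) {
            my $re = $sites->[$i];
            $matches{$name}{$i} = [];    # always defined, possibly empty
            while ( $seq =~ /$re/g ) {
                push @{ $matches{$name}{$i} }, $-[0];    # $-[0] is the match start offset
            }
        }
    }
    return %matches;
}

my %found = collect_positions(
    { '>demo' => 'ACGTACGT' },
    [ qr/CG/, qr/TT/ ],
);

print "site 0: @{ $found{'>demo'}{0} }\n";
print "site 1 has ", scalar @{ $found{'>demo'}{1} }, " hits\n";
```

Because every inner array is created up front, a pattern that never matches leaves an empty array rather than a missing key, which is the case the if (@...) test guards against.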

        Would I have to write that as:
        foreach my $index ( 0 .. $#matches{$fastaseq}{$sitekey} )?

        You would have to write that as:

        foreach my $index ( 0 .. $#{$matches{$fastaseq}{$sitekey}} )
        Note the extra braces following "$#" and fully surrounding $matches{$fastaseq}{$sitekey} -- that latter bit is an array reference, and perl won't be able to parse it as such next to "$#" unless the braces are there to establish the precedence for de-referencing the array ref that is the value of that hash element.
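
A small sketch of the $#{...} form in action, using made-up data; last_index and element_count are hypothetical helper names introduced only for this example.

```perl
use strict;
use warnings;

# Hypothetical data: one populated array and one empty array.
my %matches = (
    '>LG_XIV' => { 0 => [ 10, 20, 30 ], 1 => [] },
);

# Last index of the array stored at $matches{$fastaseq}{$sitekey}.
sub last_index {
    my ( $fastaseq, $sitekey ) = @_;
    return $#{ $matches{$fastaseq}{$sitekey} };
}

# Number of elements in the same array.
sub element_count {
    my ( $fastaseq, $sitekey ) = @_;
    return scalar @{ $matches{$fastaseq}{$sitekey} };
}

print "last index: ", last_index( '>LG_XIV', 0 ), "\n";
print "count:      ", element_count( '>LG_XIV', 0 ), "\n";

# For an empty array the last index is -1, so 0 .. -1 is an empty
# range and the loop body never runs.
my $iterations = 0;
foreach my $index ( 0 .. last_index( '>LG_XIV', 1 ) ) {
    $iterations++;
}
print "iterations over empty array: $iterations\n";
```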

        As for undefined elements, I don't expect I'll see any.

        Yeah, we've all heard that before... (and we've all caught ourselves saying and believing the same thing, in reference to some chunk of code we've written). That's the best thing about making good use of a good debugger: it helps us to get past this kind of fantasy.
Re: foreach in HoHoA problems
by graff (Chancellor) on Oct 28, 2006 at 11:58 UTC
    If you haven't been using Data::Dumper yet, now is the time to start...
    use Data::Dumper;   # add this near the top of your script:

    # ... now, down at line 383 (had been 382 before adding "use Data::Dumper"):
    SITEKEY: for $sitekey (sort {$a <=> $b } keys %{$matches{$fastaseq}}) {
        print "\$matches{$fastaseq}{$sitekey} is:\n",
              Dumper( $matches{$fastaseq}{$sitekey} );   # add this
        SET: foreach $hit ( ... ) {
            # the above print statement used to be here
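
For anyone who has not used Data::Dumper before, here is a self-contained sketch of what it does with a nested structure. The sample data is invented; Sortkeys and Indent are real Data::Dumper settings, chosen here only to keep the output compact and in a stable order.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;    # print hash keys in sorted order
$Data::Dumper::Indent   = 1;    # compact, fixed-width indentation

# Hypothetical sample data shaped like %matches.
my %matches = (
    '>LG_XIV' => { 0 => [ 6906413, 6906500 ], 1 => [] },
);

# Dumper takes references and returns a string showing the whole
# structure, so nested arrays print as their contents instead of
# as ARRAY(0x...).
my $dump = Dumper( $matches{'>LG_XIV'} );
print $dump;
```

Passing a reference (for example \%matches) rather than a plain hash keeps the output as one nested structure instead of a flat list of $VAR1, $VAR2, and so on.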
    Also, if you haven't started using "perl -d" to run your script, it's time to learn about the perl debugger, so you can step through your code and check what is being assigned to your structure as it happens.

    One other hint: if you're looking at data structure trouble at line 384 and you've lost your train of thought, your problem may be "fasta pasta" -- i.e. spaghetti code. (Sorry, couldn't resist.)

    Try modularizing -- identify functions or code blocks that can be made fairly independent, put them into separate files that simply define subroutines, and put "require" statements into the main script to load those other files.

    This will make it easier to test things, and so long as the different "modules" really function independently, you'll find it easier to focus your attention on the problem areas, because they will be smaller. The discipline of not using or altering global variables within subroutines (because the subs will be defined in separate source files), will do you good.

      Thanks, I see references to Data Dumper all over the place, but haven't figured out what it actually is or how to use it yet.

      As for perl -d, I do use that very frequently. Unfortunately in this case, the program I am writing searches a 450 Mb text file line by line for specific REs, saving the matches into %matches. So going line by line by line through that process before even getting to here would kill me. Is there a way to skip the -d for a specific subroutine, yet continue it later in the program? That would help.

      Matt

        D'oh!

        man perldebug helped. I've been using the 's' key to step through my code line by line. I didn't realize I should use the 'r' key on the subroutine that did the 450 Mb file search.

        Phew! Much better. Debugging is going to go a whole lot faster now....

        Matt
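
Another option documented in perldebug, besides 'r', is to set $DB::single in the code itself so the debugger stops right after the slow part. A minimal sketch follows; scan_big_file and count_hits are hypothetical stand-ins for the real subroutines, and the data is made up.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the slow subroutine that scans the large file.
sub scan_big_file {
    return ( '>LG_XIV' => { 0 => [ 6906413, 6906500 ] } );
}

# Hypothetical stand-in for the code being debugged: counts all stored hits.
sub count_hits {
    my (%matches) = @_;
    my $count = 0;
    for my $fastaseq ( keys %matches ) {
        for my $sitekey ( keys %{ $matches{$fastaseq} } ) {
            $count += scalar @{ $matches{$fastaseq}{$sitekey} };
        }
    }
    return $count;
}

my %matches = scan_big_file();

{
    no warnings 'once';    # avoids a "used only once" warning outside the debugger
    $DB::single = 1;       # under perl -d, the debugger stops at the next statement
}

my $count = count_hits(%matches);
print "Total hits: $count\n";
```

Run the script with perl -d and type c at the first prompt: the scan runs without any stepping, and the debugger takes over again at the statement after the $DB::single assignment. Outside the debugger the assignment has no effect.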