thomc has asked for the wisdom of the Perl Monks concerning the following question:

I'm iterating through an array of arrayrefs and find that the value of $#{$eventscores} changes at the end of the inner while loop (I've printed it out to check). Here is my code
# $eventscores is an arrayref to an array of arrayrefs.
# everything works fine until the last while iteration
while ( $i <= $#{$eventscores} ) {
    $rider_id = $eventscores->[$i]->[0];
    while ( $eventscores->[$i]->[0] == $rider_id ){
        push (@scores, $eventscores->[$i]->[2] );
        $i++;
    }
    #...more code.. unrelated to loop
}
This goes into an endless loop because the value of $#{$eventscores} changes when $i is incremented beyond it (where the loop should end). Here is the key dilemma for me. If I assign
$array_count = $#{$eventscores}
then set up the loop as
while ($i <= $array_count)
it works. Why? Thanks

Replies are listed 'Best First'.
Re: $#{$array_ref} changes in loop
by Sidhekin (Priest) on Apr 18, 2007 at 03:04 UTC

    You are bitten by autovivification:

    while ( $eventscores->[$i]->[0] == $rider_id ){

    If $i at this point is beyond the end of the array, an arrayref springs into existence as you dereference the hitherto nonexistent element.

    Solution: Don't do that. A common strategy is to protect the dereferencing:

    while ( $eventscores->[$i] and $eventscores->[$i]->[0] == $rider_id ){

    Depending on the rest of your outer loop, other solutions may be cleaner though.
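To make the effect concrete, here is a minimal sketch of the autovivification described above, followed by the guarded form. The sample rows and the variable names ($before, $after_unguarded, $guarded, $after_guarded) are assumptions made up for illustration; they are not from the original post.

```perl
use strict;
use warnings;

# Hypothetical sample data: [rider_id, action_id, score]
my $eventscores = [ [ 1, 10, 5 ], [ 1, 11, 7 ], [ 2, 10, 3 ] ];

my $before = $#{$eventscores};    # last valid index before the read
my $past   = $before + 1;         # one past the end

# Merely reading through the missing element creates it:
# Perl has to make $eventscores->[$past] an arrayref before it can look up ->[0].
my $value = $eventscores->[$past]->[0];
my $after_unguarded = $#{$eventscores};

# Guarded form: test the element itself first. A plain element fetch
# does not extend the array, and "and" short-circuits the dereference.
my $guarded = [ [ 1, 10, 5 ] ];
my $j       = 1;                  # one past the end of @$guarded
my $hit     = ( $guarded->[$j] and $guarded->[$j]->[0] == 1 ) ? 1 : 0;
my $after_guarded = $#{$guarded};

print "unguarded: last index went from $before to $after_unguarded\n";
print "guarded:   last index is still $after_guarded\n";
```

This also suggests why caching the count in $array_count hides the symptom: the array still grows, but the loop no longer compares $i against the growing value.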

    print "Just another Perl ${\(trickster and hacker)},"
    The Sidhekin proves Sidhe did it!

Re: $#{$array_ref} changes in loop
by Trizor (Pilgrim) on Apr 18, 2007 at 02:59 UTC

    This is an autovivification trap. Perl autovivifies (creates) missing elements of references for hash and array refs, so every time you get $eventscores->[$i]->[2] you'll at least get something, either your desired variable or undef. So if you overrun your array before you run out of data, $rider_id will get set to undef, and then the inner loop will become infinite because the autovivified undefs will of course be equal to undef.

    While perl will handle most bounds checking for you this is one situation you have to do it yourself.
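One way to do that bounds checking by hand is sketched below. It assumes the rows are already sorted by rider id, as the original loop implies; the sample data and the %scores_for hash are hypothetical names introduced only so the example runs on its own.

```perl
use strict;
use warnings;

# Hypothetical sample data: [rider_id, action_id, score], sorted by rider_id
my $eventscores = [ [ 1, 10, 5 ], [ 1, 11, 7 ], [ 2, 10, 3 ], [ 3, 12, 9 ] ];

my %scores_for;    # hypothetical container: rider_id => [ scores ]
my $i = 0;

while ( $i <= $#{$eventscores} ) {
    my $rider_id = $eventscores->[$i]->[0];
    my @scores;

    # Check the index first; the dereference only runs while $i is in range,
    # so nothing is autovivified past the end of the array.
    while ( $i <= $#{$eventscores} and $eventscores->[$i]->[0] == $rider_id ) {
        push @scores, $eventscores->[$i]->[2];
        $i++;
    }

    $scores_for{$rider_id} = [@scores];
}

my $last_index = $#{$eventscores};    # same as before the loops ran

for my $rider ( sort { $a <=> $b } keys %scores_for ) {
    print "rider $rider: @{ $scores_for{$rider} }\n";
}
```

The inner condition tests the index rather than the element, which also behaves correctly if a row could ever be a false value.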

Re: $#{$array_ref} changes in loop
by GrandFather (Saint) on Apr 18, 2007 at 03:16 UTC

    Don't do it that way! The code is clearer and more maintainable (and actually works) if you:

    for my $i ($i .. $#{$eventscores} ) {

    or:

    for my $eventscore ( @$eventscores ) {
        $rider_id = $eventscore->[0];
        while ( $eventscore->[0] == $rider_id ){
            push (@scores, $eventscore->[2] );
            $i++;
        }
        #...more code.. unrelated to loop
    }

    as appropriate.

    Update: Completely bogus code - doesn't do what the OP's code intended!


    DWIM is Perl's answer to Gödel
Re: $#{$array_ref} changes in loop
by Moron (Curate) on Apr 18, 2007 at 13:57 UTC
    You have two loops that depend on the dimensions of the same array but only one control variable, which is set up to exceed its proper bounds; two iteration variables would work fine. The autovivification effect is better thought of as a symptom of incorrect loop construction. In other words, don't try to address that problem on its own; address the more abstract one of loop construction, and you solve this elegantly and maintainably.

    There are a number of rules for constructing loops safely in any language that all apply here. Another rule is:

    "Don't modify any of the control variables from within a loop, whether the iterator or the start and stop limits, or the compiler or interpreter is apt to stumble and behave unpredictably."

    It takes experience to know more such rules but they relate to loop construction technique more than anything else. Three of the lines of code look functionally to be exactly the three parts of a 3-arg for loop, so better just write that, e.g.:

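    for my $i ( 0 .. $#$eventscores ) {    # forces readonly
        # the bounds check keeps $eventscores->[$j] from being autovivified past the end
        for ( my $j = $i; $j <= $#$eventscores and $eventscores->[$j][0] == $eventscores->[$i][0]; $j++ ){
            push (@scores, $eventscores->[$j][2] );
        }
        # more code unrelated to loop
    }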
    Update: or if you didn't really want to exit the innerloop on a non match (I find that functionally a bit weird though you might possibly be right about it) ...
    for my $i ( 0 .. $#$eventscores ) {    # forces readonly
        for ( my $j = $i; $j <= $#$eventscores; $j++ ){
            if ( $eventscores->[$j][0] == $eventscores->[$i][0] ) {
                push (@scores, $eventscores->[$j][2] );
            }
        }
        # more code unrelated to loop
    }
    __________________________________________________________________________________

    ^M Free your mind!

Re: $#{$array_ref} changes in loop
by chakram88 (Pilgrim) on Apr 18, 2007 at 13:04 UTC
    As others have noted, you ran into autovivification. I was bitten by the same thing not too long ago, and ferreira provided a great example with several links to discussions of the issue.

    Hopefully these nodes will help you as they did me.

Re: $#{$array_ref} changes in loop
by Jenda (Abbot) on Apr 19, 2007 at 08:41 UTC

    Maybe there is a reason for this, but this looks like you should use a different data structure. If I understand it right your AoA looks somewhat like this:

    $eventscores = [
        [riderid, actionid, score],
        [riderid, actionid, score],
        [riderid, actionid, score],
        ...
    ];
    where neither the riderid nor actionid is unique. And you want to find all scores for each rider assuming the riders are in order. If you can assume that much you might be better off with a datastructure like this:
    $eventscores = {
        riderid => {
            actionid => score,
            actionid => score,
            ...
        },
        ...
    };
    in which case your code would become
    foreach my $rider (keys %$eventscores) {
        my @scores = values %{$eventscores->{$rider}};
        # or, if you need the scores in order:
        # my @scores = map {$eventscores->{$rider}{$_}} sort keys %{$eventscores->{$rider}};
        # do something with $rider and @scores
    }
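For anyone starting from the existing array of arrays, the conversion to the nested hash might look like the sketch below. The sample rows and the names $aoa and %by_rider are hypothetical, and it assumes each (riderid, actionid) pair occurs only once; with duplicates, later rows would overwrite earlier ones.

```perl
use strict;
use warnings;

# Hypothetical sample data in the original layout: [riderid, actionid, score]
my $aoa = [ [ 1, 10, 5 ], [ 1, 11, 7 ], [ 2, 10, 3 ] ];

# Build riderid => { actionid => score }.
# Autovivification is welcome here: the inner hash is created on assignment.
my %by_rider;
for my $row ( @{$aoa} ) {
    my ( $rider_id, $action_id, $score ) = @{$row};
    $by_rider{$rider_id}{$action_id} = $score;
}

# Walk the riders in numeric order, scores ordered by action id.
for my $rider ( sort { $a <=> $b } keys %by_rider ) {
    my @scores = map { $by_rider{$rider}{$_} }
                 sort { $a <=> $b } keys %{ $by_rider{$rider} };
    print "rider $rider: @scores\n";
}
```

With this layout there is no index to overrun, so the grouping loop from the question disappears entirely.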
    I don't know what you are doing with the @scores later, but in either case I do think you should consider starting to use a database. If, for example, you wanted to get an average score for each rider, you could just run a query like this
    select riderid, AVG(score) from eventscores group by riderid order by riderid
    and be done with it. If you do not want to have to install a server just use DBD::SQLite.

    If you don't know SQL yet, learn it. It'll make many things much easier.