in reply to Adding or removing keys while iterating over a hash

OK, I've just found out the answer. Apparently, according to the perl 5.6 version perlfaq4:

lwall In Perl 4, you were not allowed to modify a hash at all while
iterating over it.  In Perl 5 you can delete from it, but you still
can't add to it, because that might cause a doubling of the hash table,
in which half the entries get copied up to the new top half of the
table, at which point you've totally bamboozled the iterator code.
Even if the table doesn't double, there's no telling whether your new
entry will be inserted before or after the current iterator position.

Either treasure up your changes and make them after the iterator finishes,
or use keys to fetch all the old keys at once, and iterate over the list
of keys.
I guess perlmonk's docs must be from 5.005_3. I checked on perl.com as well beforehand, but they have the same answer. Does that mean that perl 5.6 isn't considered authoritative yet?
  • Comment on Re: Adding or removing keys while iterating over a hash

Replies are listed 'Best First'.
Re: Adding or removing keys while iterating over a hash
by Dominus (Parson) on Nov 28, 2000 at 21:30 UTC
    It's odd that it's still so restrictive, because I looked into this pretty carefully and found that although insertion in the middle of an each loop can (and does) cause big problems, there is no way that deletion can cause trouble.

    This is for two reasons. First, unlike insertion, a deletion never causes a complete reorganization of the entire hash. (This is mentioned in the part of the FAQ that you quoted.) But there's also something more subtle going on.

    The way a hash works is that it's a collection of lists of he items. Each he has a pointer to a key, to the associated value, and a pointer to the next he in the list. Every hash also keeps track of the 'current' he. When you call each, it advances the 'current item' pointer to the next he. It can do this because each he points at the next one.

    Normally, when you call delete, the memory for the deleted he is reclaimed right away. But if you delete the 'current item', there's a complication. If the he were destroyed and the memory were reclaimed right away, then the next time you called each, it wouldn't work, because it needs to find the 'next' item, and the information would have been stored in the he that you destroyed.

    So there is special-case code inside the delete function that specifically checks to see if you are deleting the 'current item', and if so, it postpones the destruction of the he until after the next time you call each. There's more special-case code inside of each that checks to see if you already used delete on the current item, and if so, it belatedly destroys the he after advancing the 'current item'.

    In other words, Larry went to some special effort to make sure that you could delete nodes inside of an each loop, so it's particularly odd that this isn't documented.

    In recognition of this, the Perl 5.6.1 the manual will include the following:

    Exception: It is always safe to delete the item most recently returned by each(), which means that the following code will work:
    while (($key, $value) = each %hash) { print $key, "\n"; delete $hash{$key}; # This is safe }
    But I still don't know why arbitrary deletions are warned against, since they do always work.

on 5.5.3, waiting for 5.6.1 (was Re: Re: Adding or removing keys while iterating over a hash)
by merlyn (Sage) on Nov 28, 2000 at 21:11 UTC
    guess perlmonk's docs must be from 5.005_3. I checked on perl.com as well beforehand, but they have the same answer. Does that mean that perl 5.6 isn't considered authoritative yet?
    Most of us with real machines to run are holding at 5.5.3, waiting patiently for 5.6.1 to be released in a few more weeks (as they've kept saying for three months now {grin}), having considered 5.6.0 as only for "early adopters". And of course 5.7.0 was "over the bleeding edge".

    -- Randal L. Schwartz, Perl hacker

      Why are you using 5.5.3 instead of 5.6.0? You say that 5.6.0 is for "early adopters", but personally I don't know any reasons for not using the later versions (not to say that there aren't any reasons; I'm just ignorant of what they are). I'm by no means a perl guru, so as far as I can see 5.6.0 just has some nice features that I would like to use. All the modules I want to use are compatible with 5.6.0, and there are a few little things that I have been using in my scripts like our(), use warnings and delete()ing array elements. None of these are really essential to what I'm doing of course, but they're nice.

      I'm quite happy to accept that 5.6.0 isn't ready for use on production machines, but could someone tell me why, or where to find out why? Thanks in advance for any pointers.

      (Apologies for the late reply, but I haven't really been in a position to until now.)

        Try this:
        perl -e 'my $x = 10; $x = "2" . $x; print $x + 0'
        With 5.6.0 this prints 10.

        There are many other things wrong with 5.6.0, but this one is generally good enough to make the point.