Adding or removing keys while iterating over a hash

mischief has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Adding or removing keys while iterating over a hash by mischief (Hermit) on Nov 28, 2000 at 15:27 UTC
OK, I've just found out the answer. Apparently, according to the perl 5.6 version perlfaq4: lwall In Perl 4, you were not allowed to modify a hash at all while iterating over it. In Perl 5 you can delete from it, but you still can't add to it, because that might cause a doubling of the hash table, in which half the entries get copied up to the new top half of the table, at which point you've totally bamboozled the iterator code. Even if the table doesn't double, there's no telling whether your new entry will be inserted before or after the current iterator position. Either treasure up your changes and make them after the iterator finishes, or use keys to fetch all the old keys at once, and iterate over the list of keys. I guess perlmonk's docs must be from 5.005_3. I checked on perl.com as well beforehand, but they have the same answer. Does that mean that perl 5.6 isn't considered authoritative yet?	[reply]
Re: Adding or removing keys while iterating over a hash by Dominus (Parson) on Nov 28, 2000 at 21:30 UTC
It's odd that it's still so restrictive, because I looked into this pretty carefully and found that although insertion in the middle of an `each` loop can (and does) cause big problems, there is no way that deletion can cause trouble. This is for two reasons. First, unlike insertion, a deletion never causes a complete reorganization of the entire hash. (This is mentioned in the part of the FAQ that you quoted.) But there's also something more subtle going on. The way a hash works is that it's a collection of lists of `he` items. Each `he` has a pointer to a key, to the associated value, and a pointer to the next `he` in the list. Every hash also keeps track of the 'current' `he`. When you call `each`, it advances the 'current item' pointer to the next `he`. It can do this because each `he` points at the next one. Normally, when you call `delete`, the memory for the deleted `he` is reclaimed right away. But if you delete the 'current item', there's a complication. If the `he` were destroyed and the memory were reclaimed right away, then the next time you called `each`, it wouldn't work, because it needs to find the 'next' item, and the information would have been stored in the `he` that you destroyed. So there is special-case code inside the `delete` function that specifically checks to see if you are deleting the 'current item', and if so, it postpones the destruction of the `he` until after the next time you call `each`. There's more special-case code inside of `each` that checks to see if you already used `delete` on the current item, and if so, it belatedly destroys the `he` after advancing the 'current item'. In other words, Larry went to some special effort to make sure that you could delete nodes inside of an `each` loop, so it's particularly odd that this isn't documented. In recognition of this, the Perl 5.6.1 the manual will include the following: Exception: It is always safe to delete the item most recently returned by `each()`, which means that the following code will work: `while (($key, $value) = each %hash) { print $key, "\n"; delete $hash{$key}; # This is safe }` [download] But I still don't know why arbitrary deletions are warned against, since they do always work.	[reply] [d/l]
on 5.5.3, waiting for 5.6.1 (was Re: Re: Adding or removing keys while iterating over a hash) by merlyn (Sage) on Nov 28, 2000 at 21:11 UTC
guess perlmonk's docs must be from 5.005_3. I checked on perl.com as well beforehand, but they have the same answer. Does that mean that perl 5.6 isn't considered authoritative yet? Most of us with real machines to run are holding at 5.5.3, waiting patiently for 5.6.1 to be released in a few more weeks (as they've kept saying for three months now {grin}), having considered 5.6.0 as only for "early adopters". And of course 5.7.0 was "over the bleeding edge". -- Randal L. Schwartz, Perl hacker	[reply]
Re: on 5.5.3, waiting for 5.6.1 (was Re: Re: Adding or removing keys while iterating over a hash) by mischief (Hermit) on Nov 30, 2000 at 15:39 UTC
Why are you using 5.5.3 instead of 5.6.0? You say that 5.6.0 is for "early adopters", but personally I don't know any reasons for not using the later versions (not to say that there aren't any reasons; I'm just ignorant of what they are). I'm by no means a perl guru, so as far as I can see 5.6.0 just has some nice features that I would like to use. All the modules I want to use are compatible with 5.6.0, and there are a few little things that I have been using in my scripts like our(), use warnings and delete()ing array elements. None of these are really essential to what I'm doing of course, but they're nice. I'm quite happy to accept that 5.6.0 isn't ready for use on production machines, but could someone tell me why, or where to find out why? Thanks in advance for any pointers. (Apologies for the late reply, but I haven't really been in a position to until now.)	[reply]
Re (tilly) 2: on 5.5.3, waiting for 5.6.1 (was Re: Re: Adding or removing keys while iterating over a hash) by tilly (Archbishop) on Nov 30, 2000 at 17:00 UTC
Re: Re (tilly) 2: on 5.5.3, waiting for 5.6.1 (was Re: Re: Adding or removing keys while iterating over a hash) by snax (Hermit) on Nov 30, 2000 at 17:09 UTC
Re: Adding or removing keys while iterating over a hash by dws (Chancellor) on Nov 28, 2000 at 23:02 UTC
The other reason not to do it is that you're liable to confuse the heck out of whoever picks up the code next. The remove case is easy to understand, but it's not at all clear to the average reader (who understands that hashes are unordered) what it means to do insert a new key in the middle of an iteration. "Do I see the key again later in the iteration? Uh... Uh... We'll have to run it and see." I hate being backed into "we'll have to run it and see" mode, and try not do to that to others.	[reply]