in reply to Re: Re: Re: Re: print scalar %hash... erm
in thread print scalar %hash... erm

As pileswasp pointed out, you can preallocate your hash by assigning to keys, similar to preallocating an array by assigning to $#array.

Most of the time, of course, you'll just let Perl worry about the number of buckets. Here's how that works...

When you add a new key to the hash, perl checks whether there are too many keys for the number of buckets. If that's the case, perl allocates a bunch more buckets. Then, perl goes through the whole hash, calls the hash function for each key, and figures out where it belongs in the newly-resized hash.

You're probably thinking that resizing the hash and reinserting all the elements is slow; you're absolutely right. When perl has to resize a hash, it doubles the number of buckets, which keeps it from having to resize your hashes too often.

  • Comment on Re: Re: Re: Re: Re: print scalar %hash... erm

Replies are listed 'Best First'.
(tye)Re: print scalar %hash... erm
by tye (Sage) on May 25, 2001 at 05:03 UTC

    Actually, Perl doesn't have to call the hash function for each key as the full hash value is stored along with the key so Perl just needs to compute the remainder between that value and the new number of buckets.

    Another question is whether Perl decides to allocate more buckets based on the number of buckets in use or on the number of keys. I remember being surprised by the answer to this and vaguely remember that the answer has changed. Unfortunately I don't have the time right now to dig this information up.

            - tye (but my friends call me "Tye")
      When you do have time I would be very interested to take a look.

        I'm glad I looked into this again. It appears to be a bug (well, it is at least an "interesting" algorithm). Here is my test code:

        #!/usr/bin/perl -w use strict; my $key='a'; my %check=($key=>1); # Used to find keys that hash similarly my %fill= %check; # Used to show hash isn't expanded my %grow= %check; # Used to show when hash does expand $|++; while( 1 ) { $grow{$a}= $check{++$a}= 1; if( 1 == substr(%check,0,1) ) { $fill{$a}= 1; print %grow." ".%check." ".%fill." ".keys(%fill).$/; } else { delete $grow{$a}; } delete $check{$a}; }
        and the start of the output:
        1/8 1/8 1/8 2 1/8 1/8 1/8 3 1/8 1/8 1/8 4 1/8 1/8 1/8 5 1/8 1/8 1/8 6 1/8 1/8 1/8 7 2/16 1/8 1/8 8 2/16 1/8 1/8 9 2/16 1/8 1/8 10 2/16 1/8 1/8 11 2/16 1/8 1/8 12 2/16 1/8 1/8 13 2/16 1/8 1/8 14 2/16 1/8 1/8 15 4/32 1/8 1/8 16
        which indicates that it is the number of keys (not the number of full buckets) that determines whether the hash should be extended to include twice as many buckets. But, this check is only done when a previously empty bucket gets an item. (Sorry, I'd normally explain this in more detail but I'm still a bit pressed for time.)

        I'll have to find some time to look at the source code...

        Update: I submitted a patch to fix this but it was mostly not understood and so I'm pretty sure it didn't get included. *sigh*

        Update: Many months later I tried again and managed to get the patch accepted that time.

                - tye (but my friends call me "Tye")