I don't know about perl's implementation specifically, so I'm just explaining the general concept.
An associative array (such as a Perl hash) is implemented as an array. To figure out into in which element of the array -- called buckets -- a key belongs, a function is applied to the key. The function always returns the same number for a given key, but the number is not unique among all keys. In other words, a hashing algorithm is applied to the key.
For example, given a hash of 8 buckets, the hasing function could return
3 for key "foo"
4 for key "bar"
3 for key "bla"
The structure would look like:
@hash_guts = (
undef,
undef,
undef,
[ $bla_node, $foo_node ],
[ $bar_node ],
undef,
undef,
);
Where a node contains the key and the value:
bless([$key, $val], Node)
To find $hash{"foo"}, "foo" is passed through the hashing algorithm. Then, the list in the appropriate bucket is scanned for the appropriate key. Once the node with the right key is found, the value is returned.
In the above example, scalar(%hash) would return "2/8". (2 buckets out of 8 are holding 3 entries.) So what does that mean? Contemplate searching for node7 in these two hashes:
@hash1_guts = ( # scalar(%hash1) is 2/8
undef,
[ $node1 ],
undef,
undef,
undef,
undef,
[ $node2, $node3, $node4, $node5, $node6, $node7 ],
undef,
);
@hash2_guts = ( # scalar(%hash2) is 6/8
[ $node1 ],
[ $node2 ],
[ $node3 ],
[ $node4, $node5 ],
[ $node6 ],
undef,
undef,
[ $node7 ],
);
The second hash would return a result much more quickly because it doesn't have to search through a long list after finding the right bucket. The closer the numerator of scalar(%hash) is to the numbers of keys, the less time the average lookup will take.
Now consider adding another element to the second hash. It's almost guaranteed to make a list longer, so more buckets are added for future growth:
@hash2_guts = ( # scalar(%hash2) is now 7/16
[ $node1 ],
undef,
[ $node2 ],
undef,
[ $node3 ],
undef,
[ $node4, $node5 ],
undef,
[ $node6 ],
undef,
undef,
undef,
undef,
[ $node8 ],
[ $node7 ],
undef,
);
|