in reply to Using keys to increase hash efficiency

You might try using a random (or at least less predictable) set of keys, instead of just incrementing by one each time. I don't have any evidence to back this up, but my intuition tells me sequential key names aren't a very good test here. Seed the random number generator with the same seed inside each code block to ensure you're getting the same sequence of keys in each.
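Something along these lines is what I have in mind (just a sketch; the seed and the key length are arbitrary choices):

    use strict;

    # Same seed in each test block, so every run sees the identical key sequence.
    srand(12345);

    my %hash;
    for (1 .. 10000) {
        # Build a pseudo-random 8-character key instead of a sequential one.
        my $key = join '', map { chr(97 + int rand 26) } 1 .. 8;
        $hash{$key} = 1;
    }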

Also, if you only preallocate to 10000 buckets and you're inserting 10000 keys, it's going to need to reallocate at least once anyway (the largest reallocation, therefore the slowest?). A better test might be to preallocate 20000 buckets, and then do 10000 insertions.
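By preallocation I mean the lvalue form of keys(); roughly like this (a sketch only, and I believe perl rounds the bucket count up to a power of two internally):

    use strict;

    my %hash;
    keys(%hash) = 20000;    # ask for at least 20000 buckets up front

    $hash{"key$_"} = $_ for 1 .. 10000;

    # A non-empty hash in scalar context reports used/allocated buckets,
    # e.g. something like "6170/32768", so you can see whether a split happened.
    print scalar(%hash), "\n";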

Does anyone know how clever Perl's hashing function is? And how does Perl deal with hash collisions between keys? How likely is it that Perl is getting enough hash collisions that the number of buckets in the hash hardly matters?

Alan

Update: Thanks for the pointer to the information about the Perl hashing function. After reading that, I can confidently say that in older versions of Perl, inserting sequential keys would cause a lot of hash collisions, but with the new hashing function that shouldn't be a problem.
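For the curious, my understanding is that the old-style hash was essentially a multiply-by-33 loop over the key's bytes (the real macro lives in hv.h and has since gained an extra mixing step, so treat the constants here as approximate). A quick mock-up to see how sequential keys spread over a power-of-two bucket count:

    use strict;

    # Rough imitation of the old-style perl hash; details may differ from the source.
    sub old_style_hash {
        my ($key) = @_;
        my $h = 0;
        $h = ($h * 33 + ord $1) & 0xffffffff while $key =~ /(.)/g;
        return $h;
    }

    my $buckets = 16384;          # power of two, as perl uses internally
    my %fill;
    for my $i (1 .. 10000) {
        my $slot = old_style_hash("key$i") & ($buckets - 1);
        $fill{$slot}++;
    }

    printf "used %d of %d buckets, worst bucket holds %d keys\n",
        scalar(keys %fill), $buckets,
        (sort { $b <=> $a } values %fill)[0];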


RE: Re: Using keys to increase hash efficiency
by Anonymous Monk on Jul 29, 2000 at 02:39 UTC
    I played around with this thread a bit, but I haven't come to any earth-shattering conclusions yet. In the meantime, this is the best explanation of Perl's hashing I've ever seen: http://www.deja.com/getdoc.xp?AN=650577132&fmt=text -dlc