in reply to Re^2: Designing a cache data structure
in thread Designing a cache data structure

This is untrue. As more elements get added to perl hashes, more buckets get added and the chances of hash collisions increase. Both of these aspects lead to a logarithmic increase in the amount of work needed to access the average key-value pair. You could always predeclare a large hash or array to get rid of progressive bucket costs, but this only hides the logarithmic increase in a large constant multiplier; for sufficiently large N, it will start growing logarithmically again.
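
You can watch the buckets grow as you insert. This sketch assumes a perl of this vintage, where a hash in scalar context reports its used/total bucket ratio (on 5.26 and later you would need Hash::Util's bucket_ratio instead):

    my %h;
    my $buckets = 0;
    for my $i (1 .. 2**16) {
        $h{"key$i"} = $i;
        # scalar(%h) looks like "5461/8192", i.e. used/total buckets
        my ($total) = scalar(%h) =~ m{/(\d+)$};
        if (defined $total && $total != $buckets) {
            printf "%6d keys -> %6d buckets\n", $i, $total;
            $buckets = $total;
        }
    }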

The good news is that logarithmic growth is not bad at all, and people use large hashes and sort large arrays all the time without much trouble.

-Mark

Replies are listed 'Best First'.
Re^4: Designing a cache data structure
by Jenda (Abbot) on Feb 20, 2005 at 21:12 UTC

    The chances of hash collisions increase only as long as you add items without adding buckets; as soon as you add (double) the buckets, the chance drops! By predeclaring the hash you only get rid of the several rehashes.
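
    Predeclaring is just an lvalue assignment to keys() (documented in perldoc -f keys), so a sketch of skipping those rehashes would be:

        my %h;
        keys(%h) = 10_000;   # allocate buckets up front; perl rounds up
                             # to the next power of two (here 16384)
        $h{$_} = 1 for 1 .. 10_000;   # filling now triggers no rehash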

    Jenda
    We'd like to help you learn to help yourself
    Look around you, all you see are sympathetic eyes
    Stroll around the grounds until you feel at home
       -- P. Simon in Mrs. Robinson

Re^4: Designing a cache data structure
by spurperl (Priest) on Feb 21, 2005 at 06:10 UTC
    Hashes with intelligent management adapt themselves to growth. It's best when you know exactly what the data is, of course. In my case, as the cache is size-limited anyway, the hash can be "true O(1)". If the required cache size is N, making a hash with ~2N buckets should make practically all accesses O(1).
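
    A minimal sketch of that idea; the cache_put helper and its deliberately crude eviction policy are just illustration, not any particular real implementation:

        my $N = 1024;              # required cache size
        my %cache;
        keys(%cache) = 2 * $N;     # ~2N buckets, so the load factor stays low

        sub cache_put {
            my ($key, $value) = @_;
            # stay under the size limit; drop an arbitrary entry (not LRU)
            delete $cache{ (keys %cache)[0] }
                if keys(%cache) >= $N && !exists $cache{$key};
            $cache{$key} = $value;
        }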