Hash Internals

orthanc has asked for the wisdom of the Perl Monks concerning the following question:

Hi All,

I've been working a lot with hashes recently and as you all know you can't get stuff out in insertion order unless you use Tie::IxHash.

I know you will all probably say you should use arrays if you want a certain order, but what if you want the hash functionality?

The main point of this question being if perl uses its own indexing method to store the hash internally, why not just have it tied internally? Does it make a difference?

Any suggestions?

Orthanc

Comment on Hash Internals

Replies are listed 'Best First'.
Re: Hash Internals by japhy (Canon) on Oct 19, 2000 at 16:13 UTC
The ordered hash structure uses three things: a hash of key-index pairs an array of keys an array of values Thus: `$orderedhash{$newkey} = $newvalue; # is really doing $next_index = @KEYS; $KEYINDEX{$key} = $next_index; push @KEYS, $key; push @VALUES, $value; $orderedhash{$oldkey} = $newvalue; # is really doing $key_index = $KEYINDEX{$key}; $VALUES[$key_index] = $value;` [download] Deletions require splicing the two arrays, and then lowering the values in the `%KEYINDEX` hash. And if you want to allow for in-the-middle insertion, that requires splicing again, and raising the values in the hash. This is obviously a slow structure and a slow process. It does not make for optimization. Hashes are designed so that keys can be looked up quickly, and Perl's hash structure might change soon to make Perl realize that a non-existent key isn't in the hash sooner (that's good). `$_="goto+F.print+chop;\n=yhpaj";F1:eval`	[reply] [d/l]
RE: Re: Hash Internals by d_i_r_t_y (Monk) on Oct 23, 2000 at 02:22 UTC
what is wrong with an implementation that just uses a hash and a list to keep the order of keys? FETCH-ing should be faster (a single hash lookup); STORE-ing \ would be faster (only replace the hash value), KEYS would be the same, VALUEs would be a little slower, EACH also slightly slower... i guess the big question is whether given an existing key-value, does replacing that key's value keep its original position or does it move to the 'end'?	[reply]
RE: RE: Re: Hash Internals by japhy (Canon) on Oct 23, 2000 at 02:31 UTC
That's ok for a limited design, where you'll only be adding to the front or back. `$_="goto+F.print+chop;\n=yhpaj";F1:eval`	[reply]
RE: RE: RE: Re: Hash Internals by d_i_r_t_y (Monk) on Oct 24, 2000 at 07:53 UTC
RE: RE: RE: RE: Re: Hash Internals by japhy (Canon) on Oct 24, 2000 at 16:06 UTC
Some notes below your chosen depth have not been shown here
Re: Hash Internals by orthanc (Monk) on Oct 19, 2000 at 16:30 UTC
So is efficiency the main reason why the hashes are not tied internally? And if so, does Tie::IxHash slow things down a lot? and i'm asking this question assuming that the algorithm mentioned above is similar to the method used in the Tie::IxHash.	[reply]
RE: Re: Hash Internals by japhy (Canon) on Oct 19, 2000 at 16:36 UTC
Yes, the method I described is VERY similar to Tie::IxHash, if not the same. I wrote my own implementation to see if I understood it (and did a presentation on it at YAPC 19100. Ordered hashes are slow because you totally disregard the work the hashing algorithm does to make the accesses fast. And because the setup uses those three data structures, and there's not one that maps key to value, you have to first use the hash to get key to index, and then use the array of values to get index to value. Ick. `$_="goto+F.print+chop;\n=yhpaj";F1:eval`	[reply]