in reply to Re^2: Better Hash Tables?
in thread Better Hash Tables?

If you read my comment, you can see me stating openly that I only react to the article, not to the paper. :-)

I'm not through the paper yet, and I admit I would prefer if they stated the algorithm (also) in (pseudo)code rather than (just) in a mix of English and maths, but so far I really do not think we will see this in Perl any time soon. The algorithm looks fairly complicated, and while I have no reason to dispute the asymptotic behavior, I would expect the real-world performance for hashes that are not excessively full to be rather bad, and I do believe that for real-world applications it will be better to allocate more memory for the table than to complicate the code.

Jenda
1984 was supposed to be a warning,
not a manual!

Re^4: Better Hash Tables?
by jo37 (Curate) on Feb 20, 2025 at 07:36 UTC
    I do believe that for real world applications it will be better to allocate more memory for the table than to complicate the code.

    You are missing the point. In a hash table that is filled to a certain degree, there is some effort required to add a new element (or to search for an existing one). For several decades there seems to have been consensus that a specific algorithm was optimal (though this had never been proved); that assumption is what the paper disproves.

    A non-optimal algorithm - even on an enlarged table - is slower than the optimal one. Just because there is a measure that can easily express a 99.999% full table does not mean you should use it. The "asymptotic behavior" is asymptotic in the table's size for any given load factor.
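    To give the "fillness" point some numbers: for classical uniform probing, the expected number of probes for an unsuccessful search is roughly 1/(1 - alpha), where alpha is the load factor. This is a textbook approximation, not anything from the paper, and the sketch below (in Python, for illustration only) just evaluates it to show how the cost explodes only as alpha approaches 1:

    ```python
    # Expected probes for an unsuccessful search under classical
    # uniform probing: approximately 1 / (1 - alpha), where alpha
    # is the load factor (fraction of slots occupied).
    def expected_probes(alpha):
        """Textbook approximation; alpha must be in [0, 1)."""
        return 1.0 / (1.0 - alpha)

    for alpha in (0.5, 0.9, 0.99, 0.999):
        print(f"load {alpha:>5}: ~{expected_probes(alpha):8.1f} probes")
    ```

    At 50% full the expected cost is about 2 probes; the blow-up the paper addresses only kicks in when the table is nearly saturated.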

    Would you state there is no point in making an engine more efficient just because you can enlarge the fuel tank?

    Greetings,
    🐻

    $gryYup$d0ylprbpriprrYpkJl2xyl~rzg??P~5lp2hyl0p$

      If you make the engine 1 % more efficient while at the same time it weighs ten times as much and costs twenty times as much, I say: please make the tank 1 % bigger. Thank you.

      The algorithm suggested by the paper adds complexity and slows down all inserts, and the advantage is better asymptotic behavior in the case of hugely overfilled tables. Using this algorithm would be optimizing for a situation that doesn't happen. Any even remotely sane implementation of hashes increases the size of the table long before this algorithm might provide any improvement.
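      That "grow long before full" policy is the standard one. The sketch below (in Python, purely illustrative - it is a generic open-addressing table, not Perl's actual hash internals, and the 0.7 threshold is an assumed typical value) shows how doubling at a modest load factor keeps the table out of the overfilled regime entirely:

      ```python
      # Minimal open-addressing hash table with linear probing that
      # doubles its size once the load factor would exceed 0.7.
      # Illustrative only; real implementations (including Perl's)
      # differ in structure and thresholds.
      class Table:
          def __init__(self):
              self.size = 8
              self.used = 0
              self.slots = [None] * self.size

          def _probe(self, key):
              # Linear probing: walk until the key or an empty slot.
              i = hash(key) % self.size
              while self.slots[i] is not None and self.slots[i][0] != key:
                  i = (i + 1) % self.size
              return i

          def insert(self, key, value):
              if (self.used + 1) / self.size > 0.7:  # grow well before full
                  self._grow()
              i = self._probe(key)
              if self.slots[i] is None:
                  self.used += 1
              self.slots[i] = (key, value)

          def _grow(self):
              old = [s for s in self.slots if s is not None]
              self.size *= 2
              self.used = 0
              self.slots = [None] * self.size
              for k, v in old:
                  self.insert(k, v)

          def get(self, key):
              s = self.slots[self._probe(key)]
              return s[1] if s is not None else None
      ```

      Because the resize fires at 70% occupancy, probe sequences stay short and the near-saturation case the paper optimizes for simply never occurs.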

      It may make sense in some special cases where memory is the scarce resource and slowing down all inserts/searches by a constant factor is not the important thing. Possibly when the hash table is a physical processor cache or something, but that's not what Perl is for, and it's not where Perl is used.

      Jenda
      1984 was supposed to be a warning,
      not a manual!