in reply to Re^7: RFC: Is there a solution to the flaw in my hash mechanism? (And are there any others?)
in thread RFC: Is there a solution to the flaw in my hash mechanism? (And are there any others?)
But as I said in an earlier thread, I'd use a b-tree for this ... so it's only a suggestion of something else to think about ;)
Despite my earlier somewhat flippant remark about my history with B-trees, I did give them some thought at that time. This is the logic I used to exclude them for this.
The main advantages B-trees have over binary trees are:
But they still require quite a lot of pointers; which at 8-bytes per is a significant cost of memory.
With on-disk DBs, this leads to a significant saving in expensive reads from disk; but not applicable here.
The increased locality of reference would play to the strengths of cpu caching; but it would require very careful design and tuning to get the most out of that. (Think Judy arrays complexity.)
But in the final analysis; a B-tree is O(logN) for lookups; and any advantages B-tree might have over other trees is only titivating at the edges; some reduction in memory requirement; some potential for reduced cache misses.
My (crude & flawed) test implementation of this hash has demonstrated O(1.021) average lookup probes for 175 million in a 200,000,033 array (87% fill); and both the insert & lookup code is trivial; and very fast.
With a b-tree, the average lookup for 175 million requires 8.25 probes; and the structure would require 3 times the space.
|
|---|