in reply to Re^2: RFC - Tie::Hash::Ranked
in thread RFC - Tie::Hash::Ranked
> I am not sure what you mean WRT a pairwise comparison function.

The problem is not being able to sort using complicated orderings -- that you have accomplished. The problem is that without a pairwise comparison function, you have to re-sort the entire collection every time an item is inserted. A better implementation would either binary-search the already-sorted array for the insertion point, or (if you're worried about splice not being constant time) use your favorite balanced tree structure instead of an underlying array. Both work from pairwise comparisons and take O(log n) per insertion/deletion, instead of the O(n log n) the module currently spends; a sketch of the binary-search approach is below.
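Here is a minimal sketch of what I mean, not code from the module: a lower-bound binary search driven by a pairwise comparator. The names `insertion_point` and `$cmp` are hypothetical, and the comparator here is a plain numeric one standing in for whatever ordering the tie uses.

```perl
use strict;
use warnings;

# Find where $new belongs in the already-sorted @$sorted, using only
# pairwise comparisons; O(log n) comparisons per insertion.
sub insertion_point {
    my ($sorted, $new, $cmp) = @_;
    my ($lo, $hi) = (0, scalar @$sorted);
    while ($lo < $hi) {
        my $mid = int(($lo + $hi) / 2);
        if ($cmp->($sorted->[$mid], $new) < 0) {
            $lo = $mid + 1;     # $new goes somewhere after $mid
        }
        else {
            $hi = $mid;         # $new goes at or before $mid
        }
    }
    return $lo;
}

my @sorted = (1, 3, 5, 9);
my $cmp    = sub { $_[0] <=> $_[1] };

# Insert 4 without ever re-sorting the whole array:
splice @sorted, insertion_point(\@sorted, 4, $cmp), 0, 4;
# @sorted is now (1, 3, 4, 5, 9)
```

The splice itself is O(n) in the worst case, which is exactly why I mention a balanced tree as the alternative if that bothers you.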
That's a huge difference -- huge enough that it's not just a "need-for-speed" optimization. In fact, it's generally accepted that common (non-catastrophic) hash operations should cost no more than O(log n), as I think tilly was alluding to. And think about it -- even naively looping through an unsorted array on every operation could accomplish all the ranking you need in O(n) time (see the scan sketched below), so sorting on every insertion is definitely a step backwards.
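To make that concrete, here's what I mean by the naive approach -- again a hypothetical sketch, not the module's code, and it assumes "rank" means "how many stored items compare less than this one":

```perl
# Compute the rank of $target against a plain *unsorted* array with one
# O(n) scan: count how many items the comparator orders before it.
sub rank_of {
    my ($items, $target, $cmp) = @_;
    my $rank = 0;
    for my $item (@$items) {
        $rank++ if $cmp->($item, $target) < 0;
    }
    return $rank;    # number of items ordered before $target
}
```

Even this zero-cleverness version is O(n) per operation, already asymptotically better than an O(n log n) re-sort per operation.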
Update: I've looked at the code of the module. In fact it does not necessarily sort after every insertion, but you can easily construct a sequence of operations on the hash so that it does. Alternate STORE and FIRSTKEY operations k times and the total cost is O(kn log n). Using either approach mentioned above, the same sequence of operations costs O(k log n). So while your optimizations are nice, they don't actually help asymptotically, unless you perform O(n log n) insert/delete/fetch operations in between every call to keys. For many uses of a hash, this condition holds, but for a general tool I think I'd rather have all operations logarithmic ;)
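For the record, this is the kind of sequence I mean. It assumes the module follows the usual tie interface and that a fetch of the first key triggers FIRSTKEY, which re-sorts; constructor arguments are omitted since I'm only illustrating the access pattern:

```perl
use Tie::Hash::Ranked;

tie my %rank, 'Tie::Hash::Ranked';   # constructor args omitted

for my $i (1 .. 1_000) {
    $rank{"key$i"} = $i;             # STORE invalidates the cached order
    my ($first) = each %rank;        # FIRSTKEY forces a full O(n log n) re-sort
    keys %rank;                      # reset the iterator for the next pass
}
# Total: O(k * n log n) for k iterations, versus O(k log n) with binary
# search on a sorted array or with a balanced tree.
```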
blokhead
Re^4: RFC - Tie::Hash::Ranked
by Limbic~Region (Chancellor) on Oct 12, 2004 at 16:38 UTC