in reply to *Fastest* way to print a hash sorted by value
This is what a "heap sort" is great for (finding and sorting the top N items of a very large list). I keep wanting to implement this in Perl. I don't think the Heap modules include this functionality (which I find a bit strange, but *shrug*, maybe I'm just not seeing it -- the modules are a bit more complicated than I expect so I either didn't dig very deep into them or I just don't at the moment remember having done so).
In any case, the technique isn't horribly difficult and I'm sure google will find you several good tutorials on it.
Update: So far, all of the other responses are waving their hands vaguely around "keep track of the top N items". That sounds simple enough but some ways are certainly more efficient for doing this than others.
The first naive method would be to keep the top N items unsorted in a list and just compare new items as you get them against every single item in the list until you find one that you can replace with the new one, or finish comparing and reject the new one. That is a ton of compares, perhaps even more than just sorting the full list.
The second naive method would be to keep the top N items in sorted order so that you can do a single compare to tell whether to reject the new item. But, if you decide to keep it, then you have to insert it into the list in sorted order. That will take you log(N) compares (if you use a binary search) and then you'll have to shift about 1/2 of the list over to fit it in (or use something like a linked list which isn't very Perlish and so are a bit slower).
The heap sort keeps the top N items in an array in a partially sorted order. You can still tell whether a new item should be insert or not with a single comparison. However, the algorithm tells you how to only use log(N) comparisons and log(N) swappings of items to insert a new item while preserving the partial ordering. Then it tells you how to pull the items out of the "heap" so that they all come out in fully sorted order.
- tye
|
|---|
| Replies are listed 'Best First'. | |
|---|---|
|
Re: Re: *Fastest* way to print a hash sorted by value (heap)
by tilly (Archbishop) on Aug 08, 2003 at 22:33 UTC |