in reply to Re: Sorting Hash By Large Number Values
in thread Sorting Hash By Large Number Values

Inserting each hash key into a heap doesn't seem to offer a great deal of performance gain if any, given that on the one hand each insert is becoming more and more costly and on the other perl knows it can optimise the sort to take advantage of the hash, whereas the heapify can't do that in this example. I suspect therefore that a great deal more brain-grief would be needed to beat the simplistic sort algorithm even though only the top n are needed.

One world, one people

  • Comment on Re^2: Sorting Hash By Large Number Values

Replies are listed 'Best First'.
Re^3: Sorting Hash By Large Number Values
by davido (Cardinal) on Aug 03, 2005 at 15:31 UTC

    By adding " 'max_count' => 10 " at heap creation time, you can limit the number of elements tracked in the heap. As new elements are added, inferior elements past the top ten will be dropped. That will reduce the amount of time needed to add elements and improve the efficiency of the heap solution. Of course then you don't have a heap full of values, but instead only the top ten, which may limit the datastructure's usefulness later in the script if the heap is needed again.


    Dave

      Good move - that modification convinces me. Although in that case, the underlying algorithm we now have seems too simple to bother with the overheads of a heap module (update: because a hash is a special sort of heap!). The following algorithm uses the bare bones to deliver the top-n into %SortaHeapy, without using any sorting (code updated):
      $NumOfPortsToGet = 10; local $loaded = 0; my %SortaHeapy = (); for my $key ( keys %Stats ) { ($key eq "UnauthOrigin") or SortaHeapy( $key ); } sub SortaHeapy{ my $key = shift; # if mini-heap not full just load it in if ( $loaded < $NumOfPortsToGet ) { $SortAHeapy{ $key } = $Stats{ $key }; $loaded++ return; } # otherwise do the degenerate heap sort/replace for my $hkey ( keys %SortaHeapy ) { if ( $SortaHeapy{ $hkey } < $Stats{ $key } ) { delete $SortaHeapy{ $hkey }; # replace SortaHeapy{ $hkey }; # but iterate the victim return } } }

      One world, one people