in reply to Re^2: Sorting Hash By Large Number Values
in thread Sorting Hash By Large Number Values

By adding " 'max_count' => 10 " at heap creation time, you can limit the number of elements tracked in the heap. As new elements are added, inferior elements past the top ten will be dropped. That will reduce the amount of time needed to add elements and improve the efficiency of the heap solution. Of course then you don't have a heap full of values, but instead only the top ten, which may limit the datastructure's usefulness later in the script if the heap is needed again.


Dave

Replies are listed 'Best First'.
Re^4: Sorting Hash By Large Number Values
by anonymized user 468275 (Curate) on Aug 03, 2005 at 15:40 UTC
    Good move - that modification convinces me. Although in that case, the underlying algorithm we now have seems too simple to bother with the overheads of a heap module (update: because a hash is a special sort of heap!). The following algorithm uses the bare bones to deliver the top-n into %SortaHeapy, without using any sorting (code updated):
    $NumOfPortsToGet = 10; local $loaded = 0; my %SortaHeapy = (); for my $key ( keys %Stats ) { ($key eq "UnauthOrigin") or SortaHeapy( $key ); } sub SortaHeapy{ my $key = shift; # if mini-heap not full just load it in if ( $loaded < $NumOfPortsToGet ) { $SortAHeapy{ $key } = $Stats{ $key }; $loaded++ return; } # otherwise do the degenerate heap sort/replace for my $hkey ( keys %SortaHeapy ) { if ( $SortaHeapy{ $hkey } < $Stats{ $key } ) { delete $SortaHeapy{ $hkey }; # replace SortaHeapy{ $hkey }; # but iterate the victim return } } }

    One world, one people