in reply to Limit the size of a hash

And here my version based on insertion sort, complete with random data generation and testing:

use strict; use warnings; sub nextrecord { return undef unless shift; # generate random records return { code => join( " ", (int rand 10), (int rand 10), (in +t rand 10), (int rand 10) ), score => rand }; } my $keep = 10; my $counter = 1000; # initialize with the first 10 records sorted my @largest = sort { $a->{score} <=> $b->{score} } map nextrecord( $co +unter-- ), 1..$keep; my @test = @largest; # for testing while( my $next = nextrecord( $counter-- ) ){ push @test, $next; # keep all for testing my $pos = 0; # insertion sort while(($pos < $keep) && ($largest[$pos]->{score} < $next->{sco +re}) ) { $largest[$pos-1] = $largest[$pos] if $pos; $pos++; } $largest[$pos-1] = $next if $pos; } print map { "$_->{code}\t$_->{score}\n" } @largest; # test print "\n"; print map { "$_->{code}\t$_->{score}\n" } ( sort { $a->{score} <=> $b- +>{score} } @test )[-10..-1];

Replies are listed 'Best First'.
Re^2: Limit the size of a hash
by Random_Walk (Prior) on Sep 05, 2013 at 13:37 UTC

    I just wrote my own with insertion sort. Looks like you beat me to it. This one no longer needs to initialise and sort the store array with real data, it just fills it with the correct number of zero values.

    use strict; use warnings; use Data::Dumper; my $keep = 5; my @keeper; push @keeper, [0] for 1 .. $keep; while (<DATA>) { my @vals = split; next unless @vals; # possibly more sophisticated test my $score = pop @vals; addlist ([$score, \@vals], \@keeper); } print Dumper \@keeper; # slide up the list knocking down existing records # until we no longer have the biggest sub addlist { my ($rec, $list) = @_; for my $i (0 .. $#$list) { if ($rec->[0] > $list->[$i]->[0]) { $list->[$i-1] = $list->[$i] if $i; $list->[$i] = $rec; } else { last; } } } __DATA__ 0 5 3 7 0.97 1 3 4 8 0.18 1 4 4 8 0.21 1 7 4 4 0.22 1 8 4 8 0.52 1 9 4 1 0.16 1 5 1 8 0.18 1 5 2 9 0.72 1 5 5 8 0.32 1 5 6 6 0.17 1 5 7 4 0.52 1 5 9 3 0.92 1 5 4 8 0.99 6 8 4 6 0.54

    Cheers,
    R.

    Pereant, qui ante nos nostra dixerunt!

      I am usually careful with default values. Here you implicitely assume that all scores are positive, otherwise the zeroes could influence the result...

        Good point. Here is a better version that does not have to define this list first and does not make any assumptions about the size values.

        #!/usr/bin/perl use strict; use warnings; use Data::Dumper; my $keep = 5; my @keeper; while (<DATA>) { my @vals = split; next unless @vals; # possibly more sophisticated test my $score = pop @vals; addlist ([$score, \@vals], \@keeper); } print Dumper \@keeper; # slide up the list knocking down existing records # until we no longer have the biggest sub addlist { my ($rec, $list) = @_; for my $i (0 .. $keep-1) { if (not $list->[$i] or $rec->[0] > $list->[$i]->[0]) { $list->[$i-1] = $list->[$i] if $i; $list->[$i] = $rec; } else { last; } } } __DATA__ 0 5 3 7 0.97 1 3 4 8 0.18 1 4 4 8 0.21 1 7 4 4 0.22 1 8 4 8 0.52 1 9 4 1 0.16 1 5 1 8 0.18 1 5 2 9 0.72 1 5 5 8 0.32 1 5 6 6 0.17 1 5 7 4 0.52 1 5 9 3 0.92 1 5 4 8 0.99 6 8 4 6 0.54

        Cheers,
        R.

        Pereant, qui ante nos nostra dixerunt!