Eliya:

I've modified your solution a bit, as I was uncomfortable with sorting the array on every iteration. For some data sets (very large ones and/or skewed distributions), the sort shouldn't be needed very often, so this version only sorts when the top-N list actually changes.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use autodie;
    use feature ':5.10';

    my $top_N     = 10;
    my $gen_score = sub { 1000 * rand() * rand() };
    my $noisy     = 0;    # 0=quiet, 1=show repl, 2=show add, 3=all traces

    # Generate temp data
    if (@ARGV or ! -e 'TEMP') {
        my $row_count = shift // 100;
        say "Building temp file with $row_count items" if $noisy;
        open my $FH, '>', 'TEMP';
        printf $FH "ID_%04u %.2f\n", $_, &$gen_score() for 1 .. $row_count;
        close $FH;
    }

    my @top;
    my ($cnt_add, $cnt_replace) = (0) x 2;
    open my $FH, '<', 'TEMP';
    while (<$FH>) {
        my ($id, $score) = split ' ';
        if (@top < $top_N) {
            @top = sort { $a->[1] <=> $b->[1] } @top, [ $id, $score ];
            say "$.: [$id, $score] ADD => ", scalar(@top),
                " $top[0][0]:$top[0][1] .. $top[-1][0]:$top[-1][1]"
                if $noisy > 1;
            ++$cnt_add;
        }
        elsif ($score > $top[0][1]) {
            @top = sort { $a->[1] <=> $b->[1] } [ $id, $score ], @top[1 .. $#top];
            say "$.: [$id, $score] REPL => ", scalar(@top),
                " $top[0][0]:$top[0][1] .. $top[-1][0]:$top[-1][1]"
                if $noisy;
            ++$cnt_replace;
        }
        else {
            say "$.: [$id, $score] < [ $top[0][0], $top[0][1] ]" if $noisy > 2;
        }
    }

    for (@top) {
        say "$_->[0] : $_->[1]";
    }
    say "adds: $cnt_add, replacements: $cnt_replace, lines: $.";

A sample run:

    $ perl 919076.pl 1500
    ID_0762 : 897.90
    ID_1218 : 898.96
    ID_1468 : 917.94
    ID_0195 : 920.68
    ID_0089 : 921.92
    ID_0071 : 925.55
    ID_0668 : 933.69
    ID_1425 : 940.34
    ID_0374 : 962.16
    ID_0185 : 984.86
    adds: 10, replacements: 45, lines: 1500
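The 45 replacements in that run are roughly what a back-of-the-envelope estimate suggests. Here is a minimal sketch, assuming the rows arrive in random order and no two scores tie (the rounded `%.2f` scores can tie occasionally, so treat this as approximate). Under those assumptions, row k (for k > N) beats the current minimum with probability N/k, so the expected number of replacements is the sum of N/k over the remaining rows. The helper name `expected_replacements` is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Expected number of replacements when keeping the top $top_n of $rows
# scores, ASSUMING random arrival order and no ties (an idealization).
# Row k, for k > $top_n, lands in the top N with probability $top_n / k.
sub expected_replacements {
    my ($top_n, $rows) = @_;
    my $sum = 0;
    $sum += $top_n / $_ for $top_n + 1 .. $rows;
    return $sum;
}

# Same parameters as the sample run: top 10 out of 1500 lines.
printf "%.1f\n", expected_replacements(10, 1500);
```

For 10 out of 1500 this comes to about 49.6, so only around 3% of the lines trigger a sort at all. The estimate grows only logarithmically with the number of rows.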

Note: Yes, (a) premature optimization is the root of all evil, (b) I'm not claiming it's faster, as I haven't benchmarked it, and (c) I'm not claiming that yours is "too slow", either. I was just scratching an itch. ;^)
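For anyone who does want numbers, here is one way the comparison could be set up with the core Benchmark module. It is only a sketch: `top_n_sort_every_row` is a hypothetical stand-in for a "sort on every iteration" approach (not necessarily the code from the parent post), `top_n_sort_on_change` mirrors the loop in the script above, and the 5000 in-memory rows are synthetic. No timings are claimed here; they will vary by machine and data.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical baseline: add every row, re-sort, trim back to the top N.
sub top_n_sort_every_row {
    my ($n, $rows) = @_;
    my @top;
    for my $row (@$rows) {
        @top = sort { $a->[1] <=> $b->[1] } @top, $row;
        shift @top if @top > $n;    # drop the current minimum
    }
    return \@top;
}

# Mirrors the script above: sort only when the top-N list changes.
sub top_n_sort_on_change {
    my ($n, $rows) = @_;
    my @top;
    for my $row (@$rows) {
        if (@top < $n) {
            @top = sort { $a->[1] <=> $b->[1] } @top, $row;
        }
        elsif ($row->[1] > $top[0][1]) {
            @top = sort { $a->[1] <=> $b->[1] } $row, @top[1 .. $#top];
        }
    }
    return \@top;
}

# Synthetic rows using the same score formula as the script.
my @rows = map { [ sprintf('ID_%04u', $_), 1000 * rand() * rand() ] } 1 .. 5000;

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    sort_every_row => sub { top_n_sort_every_row(10, \@rows) },
    sort_on_change => sub { top_n_sort_on_change(10, \@rows) },
});
```

Both subs work on an array of `[id, score]` pairs rather than a file, so the comparison measures the sorting strategy and not disk I/O.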

...roboticus

When your only tool is a hammer, all problems look like your thumb.


In reply to Re^2: Dealing with Hash Table by roboticus
in thread Dealing with Hash Table by ŞuRvīvőr
