The code would need more work before it could go into List::Util

Agreed, but I did have a go at making topN() accept a comparator function, but the cost off calling back into Perl code for each comparison exacts a heavy performance penalty--to the point where it looses all benefit. Though it is possible that my fledgling XS/internals skills bore some of that cost.

The only way I see to avoid that, would be to have four separate routines, (say) highNn(); highNs(); lowNn(); lowNs();.

I think this is the route taken by sort, albeit that the "naming convention" (for backwards compatibiity?) is sort{$a<=>$b}(); thru sort{$bcmp$a}();

Since the basic algorithm needs only a size-N array, there ought to be some way to wrap this as a package that takes an arbitrary input set of unlimited size,...

Personally, I'm not very keen on that type of pseudo-OO wrapper around what is essentially a functional algorithm. Would the wrapper code accumulate a copy of everything passed or discard values that didn't make the cut?

If the former, unnecessary duplication of storage results. If the latter, I think I would prefer not to do a function call to reject individual values that don't make the cut. I'd code the accumlator scenario something like:

my @topN = topN( $n, \@data ); @topN = topN( $n, [ @topN, grep{ chomp; $_ > $topN[ -1 ] } <> ] );

(BTW. I really wish that chomp returned the chomped value. It would make some constructs so much easier.)

or maybe

my @topN( $n, \@data ); my @batch; while( not eof ARGV ) { for( my $i=0; $_ = <> and $i < $BATCHSIZE; $i++ ) { chomp; push @batch, $_; } @topN( $n, [ @topN, @batch ] ); }

Basically, I prefer to retain programmer choice in stuff like this--but there would be nothing to stop someone adding a wrapper around it, though four (type/direction) versions would probably be needed again.


Examine what is said, not who speaks.
Silence betokens consent.
Love the truth but pardon error.

In reply to Re^5: Better mousetrap (getting top N values from list X) by BrowserUk
in thread Better mousetrap (getting top N values from list X) by Limbic~Region

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.