I read once that the $a,$b trick was done for performance reasons. I know the optimimum use of sort is for perl to call predefined routines. However, sort achieves reasonable performance even for nonstandard user-defined functions. Here is what I tried, which seg-faulted. I need more in-depth perlguts knowledge to make it work. Probably I should give up and write the individual sorts.
#!/usr/bin/perl use strict; use warnings; use Algorithm::Numerical::Shuffle qw( shuffle ); use Inline C => 'DATA'; my $max = 10; my @values = 1 .. $max; my @values_mixed = shuffle(@values); my @top = topNbs(5, sub { $a <=> $b }, \@values_mixed); print join(" ",sort { $a <=> $b } @top),"\n"; __END__ __C__ int callComp( SV* cmp, GV *first, GV *second, SV *a, SV *b) { int count, rv; dSP; ENTER; SAVETMPS; PUSHMARK(SP); PUTBACK; GvSV(first) = a; GvSV(second) = b; if( ( count = call_sv( cmp, G_SCALAR|G_NOARGS) ) != 1 ) croak( "Bad comparator" ); SPAGAIN; rv = POPi; FREETMPS; LEAVE; } void topNbs( int n, SV *cmp, AV*data) { SV **topN; GV *first, *second; int len = av_len( data ); int i, j, k; int left, right; Inline_Stack_Vars; Newz( 1, topN, n + 1, SV* ); for (i = 0; i < n+1; i++) { topN[i] = newSViv( 0 ); } first = gv_fetchpv("a", TRUE, SVt_PV); second = gv_fetchpv("b", TRUE, SVt_PV); for( i = 0; i <= len; i++ ) { SV *val = av_fetch( data, i, 0 ); int comp = callComp(cmp, first, second, val, topN[ n - 1]); if (comp <= 0) continue; left = 0; right = n - 1; while (left < right) { int middle = (left + right) >> 1; if (callComp(cmp, first, second, val, topN[ n - 1]) <= 0) { left = middle + 1; } else { right = middle; } } if( callComp(cmp, first, second, topN[ left ], val) < 0) { for( k = n; k > left; k-- ) topN[ k ] = topN[ k-1 ]; topN[ left ] = val; } } Inline_Stack_Reset; for( i = 0; i < n; i++ ) Inline_Stack_Push( sv_2mortal( newSVsv( topN[ i ] ) ) ); Safefree( topN ); Inline_Stack_Done; }
Update: I must have been thinking in C++. I have moved the GV* declarations up. It still doesn't do what I want, (now it tells me the '<=>' operations are being applied to undefined values sometimes, then seg faults).

Here is some sample output:

Use of uninitialized value in numeric comparison (<=>) at top2.pl line + 11. Use of uninitialized value in numeric comparison (<=>) at top2.pl line + 11. Use of uninitialized value in numeric comparison (<=>) at top2.pl line + 11. Use of uninitialized value in numeric comparison (<=>) at top2.pl line + 11. Use of uninitialized value in numeric comparison (<=>) at top2.pl line + 11. 0 0 0 0 0 Attempt to free unreferenced scalar.

In reply to Re^10: Better mousetrap (getting top N values from list X) by tall_man
in thread Better mousetrap (getting top N values from list X) by Limbic~Region

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.