comment on

Gah! I guess I'm hooked :|

Pushing things along a little further, I succeeded in coding a macro (that doesn't break), to take care of the assignments to $a & $b and then call the comparator, reducing some of the overhead in the process:

#define CMP( callback, aa, bb ) \
( ( GvSV( a ) = ( aa ) ), ( GvSV( b ) = ( bb ) ), CmpAB( callback ) )

int CmpAB( SV* cmp ) {
    int rv;
    dSP;

    PUSHMARK(SP);

    if( call_sv( cmp, G_SCALAR|G_NOARGS ) != 1 )
        croak( "Bad comparator" );
    SPAGAIN;

    rv = POPi;

    PUTBACK; 
    return rv;
}

void topNbsAB( int n, SV *cmp, AV*data ) {
    SV* *topN;
    int len = av_len( data );
    int i, k;
    int left, right;
    SV* a = gv_fetchpv( "main::a", TRUE, SVt_PV );
    SV* b = gv_fetchpv( "main::b", TRUE, SVt_PV );
    Inline_Stack_Vars;
    Newz( 1, topN, n + 1, SV* );

    for( i = 0; i < n+1; i++ ) {
       topN[ i ] = newSViv( 0 );
    }

    for( i = 0; i <= len; i++ ) {
        SV *val = *av_fetch( data, i, 0 );
        if( CMP( cmp, val, topN[ n - 1] ) < 0 ) continue;

        left = 0;
        right = n - 1;
        while (left < right) {
           int middle = ( left + right ) >> 1;
           if( CMP( cmp, val, topN[ middle ] ) <= 0 ) {
              left = middle + 1;
           } else {
              right = middle;
           }
        }
        for( k = n; k > left; k-- ) topN[ k ] = topN[ k-1 ];
        topN[ left ] = val;
    }

    Inline_Stack_Reset;
    for( i = 0; i < n; i++ )
        Inline_Stack_Push( sv_2mortal( newSVsv( topN[ i ] ) ) );
    Safefree( topN );
    Inline_Stack_Done;
}
[download]

This appears to be reliable and gives it another step up over sort for smallish lists from biggish ones:

P:\test>topn -MAX=10000 -N=10
  1 trial  of     sort [10 of 10000] ( 17.186ms total), 17.186/trial

  1 trial  of topNbsAB [10 of 10000] ( 10.843ms total), 10.843/trial

  1 trial  of  topNbs@ [10 of 10000] ( 17.276ms total), 17.276/trial


P:\test>topn -MAX=10000 -N=100
  1 trial  of     sort [100 of 10000] ( 16.568ms total), 16.568/trial

  1 trial  of topNbsAB [100 of 10000] ( 15.485ms total), 15.485/trial

  1 trial  of  topNbs@ [100 of 10000] ( 22.263ms total), 22.263/trial

P:\test>topn -MAX=100000 -N=100
  1 trial  of     sort [100 of 100000] (231.679ms total), 231.67/trial

  1 trial  of topNbsAB [100 of 100000] (143.079ms total), 143.07/trial

  1 trial  of  topNbs@ [100 of 100000] (187.500ms total), 187.50/trial
[download]

However, as soon as size of N approaches 10% of the main list, sort once again takes a lead:

P:\test>topn -MAX=100000 -N=1000
[SNIP]
  1 trial  of     sort [1000 of 100000] (231.573ms total), 231.57/tria
+l

  1 trial  of topNbsAB [1000 of 100000] (298.991ms total), 298.99/tria
+l

  1 trial  of  topNbs@ [1000 of 100000] (296.875ms total), 296.87/tria
+l
[download]

And if the list gets really large and the N very small (which ought to favour the linear search + binary sort over full sort and slice), sort once again wins hands down. More interestingly, topNbs() (via @_) starts to show better again?

  1 trial  of     sort [10000 of 1000000] (   3.156s total), 3.156s/tr
+ial

  1 trial  of topNbsAB [10000 of 1000000] (  13.641s total), 13.641/tr
+ial

  1 trial  of  topNbs@ [10000 of 1000000] (   4.734s total), 4.734s/tr
+ial
[download]

I think this is due to the allocation, initialisation of the SV* topN array, followed by the incremental extension of the perl stack through XPUSHs as we copy the topN array onto the stack.

Examine what is said, not who speaks.

Silence betokens consent.

Love the truth but pardon error.

In reply to Re^6: Using $a and $b from XS by BrowserUk
in thread Using $a and $b from XS by tall_man

Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!

Titles consisting of a single word are discouraged, and in most cases are disallowed outright.

Read Where should I post X? if you're not absolutely sure you're posting in the right place.

Please read these before you post! —

Posts may use any of the Perl Monks Approved HTML tags:

a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr

You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)

	For:		Use:
	&		`&`
	<		`<`
	>		`>`
	[		`[`
	]		`]`

Link using PerlMonks shortcuts! What shortcuts can I use for linking?

See Writeup Formatting Tips and other pages linked from there for more info.