http://qs1969.pair.com?node_id=1216803


in reply to Re^7: Inline::C on Windows: how to improve performance of compiled code?
in thread Inline::C on Windows: how to improve performance of compiled code?

(So, BrowserUk, it looks like this stub wasn't optimized away.)

Hm. If you look above, you'll see that the 'call' from the XS wrapper to void test( SV *sv ) { ++i; } gets inlined to just 1 instruction:

    ; 31   :     test(sv);

        inc     DWORD PTR i

However, defining PERL_NO_GET_CONTEXT doesn't change a thing in the generated assembler. Of course, that is pre-optimisation code, so your timings may be a better indicator.

That said, I think you would be better off looking at ways to try and move some or all of your loop into C, rather than trying to optimise the calls from Perl to C.

What I mean is, if you are calling from Perl -> C 10e8 times, then your Perl code must consist of one or more loops. Whilst there are obviously some savings to be had by minimising the cost of the perl -> C -> perl transitions, there is (probably) a much larger saving to be had by moving the loop into C and avoiding all, or a large number, of those transitions.
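To make the shape of that concrete, here is a minimal plain-C sketch of the two kinds of entry point. The names accumulate_one and accumulate_all are made up for illustration and are not from the OP's code, and the Inline::C glue needed to unpack a Perl array into a C array is deliberately left out. The point is only that the second function is called once and does its looping on the C side:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-item entry point: a caller that loops in Perl would
   pay one perl -> C -> perl transition for every element. */
long accumulate_one(long acc, int value) {
    return acc + (long)value * value;
}

/* Hypothetical batched entry point: called once, the loop runs
   entirely in C and the per-item work is an ordinary C call that the
   compiler is free to inline. */
long accumulate_all(const int *values, size_t count) {
    long acc = 0;
    size_t i;
    assert(values != NULL || count == 0);
    for (i = 0; i < count; ++i)
        acc = accumulate_one(acc, values[i]);
    return acc;
}
```

Whether this pays off in practice depends on how much work each call does relative to the call overhead, so it wants timing rather than assuming.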

As an extreme example, the de Bruijn sequence generator I recently ported from Python to Perl takes 1587 seconds to generate the de Bruijn sequence for 8-char substrings from a 10-char alphabet; but when ported to C, that drops to 0.58 seconds (a 99.96% reduction!):

    C:\test>DeBruijnX -N=8 -ALPHA=0123456789
    Took: 1586.944328 secs
    100000000
    Took: 0.579065 secs
    100000000

And a very large part of that massive saving is avoiding the perl function call overhead of the 16 million recursive function calls involved:

    #! perl -slw
    use strict;
    # use Config; print $Config{ ccflags };
    use Inline C => Config => BUILD_NOISY => 1; #, CCFLAGS => $Config{ ccflags } . "/link /FAs";
    use Inline C => <<'END_C', NAME => '_deBruijn', CLEAN_AFTER_BUILD =>0;

    #define PERL_NO_GET_CONTEXT 1

    int n, iseq;
    STRLEN k;
    char *seq, *a;

    void dbc( int t, int p ) {
        int i;
        if( t > n ) {
            if( n % p == 0 )
                for( i = 1; i <= p; ++i )
                    seq[ iseq++ ] = a[ i ];
        }
        else {
            a[ t ] = a[ t - p ];
            dbc( t+1, p );
            for( i = a[ (t - p) ] + 1; i < k; ++i ) {
                a[ t ] = i;
                dbc( t+1, t );
            }
        }
    }

    SV *deBruijnC( SV *svAlphabet, SV *len ) {
        int i;
        char *alphabet = SvPV( svAlphabet, k );
        n = (int)SvIV( len );
        iseq = 0;
        Newxz( seq, (int)pow( (double)k, (double)n), char );
        Newxz( a, k * n, char );
        dbc( 1, 1 );
        for( i = 0; i < iseq ; ++i ) {
            seq[ i ] = alphabet[ seq[ i ] ];
        }
        return newSVpv( seq, iseq );
    }
    END_C
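If you want to play with the recursion outside of Perl, the same algorithm can be sketched in plain C with no Perl API at all. This is an illustrative variant, not the code that was timed above: the db_state struct, the function names, the malloc/calloc allocation and the freeing of the work buffer are my assumptions, swapped in for the file-scope globals and Newxz used in the Inline::C version.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* State for one run, held in a struct rather than file-scope globals
   (an assumption of this sketch, not how the Inline::C version does it). */
typedef struct {
    int n;              /* substring length */
    int k;              /* alphabet size */
    size_t len;         /* symbols emitted so far */
    unsigned char *a;   /* working word; indices 0..n are used */
    char *seq;          /* output: alphabet indices, mapped to chars at the end */
} db_state;

/* Same recursion as dbc() above. */
static void db_recurse(db_state *s, int t, int p) {
    int i;
    if (t > s->n) {
        if (s->n % p == 0)
            for (i = 1; i <= p; ++i)
                s->seq[s->len++] = (char)s->a[i];
    } else {
        s->a[t] = s->a[t - p];
        db_recurse(s, t + 1, p);
        for (i = s->a[t - p] + 1; i < s->k; ++i) {
            s->a[t] = (unsigned char)i;
            db_recurse(s, t + 1, t);
        }
    }
}

/* Returns a malloc'd, NUL-terminated sequence of length k^n, or NULL if
   allocation fails. The caller frees it. No overflow check on k^n. */
char *de_bruijn(const char *alphabet, int n) {
    db_state s;
    size_t total = 1, i;

    assert(alphabet != NULL && n > 0);
    s.k = (int)strlen(alphabet);
    assert(s.k > 0 && s.k <= 256);
    s.n = n;
    s.len = 0;

    for (i = 0; i < (size_t)n; ++i)      /* k^n without going through pow() */
        total *= (size_t)s.k;

    s.a = calloc((size_t)n + 1, 1);
    s.seq = malloc(total + 1);
    if (!s.a || !s.seq) {
        free(s.a);
        free(s.seq);
        return NULL;
    }

    db_recurse(&s, 1, 1);

    for (i = 0; i < s.len; ++i)          /* map indices to alphabet chars */
        s.seq[i] = alphabet[(unsigned char)s.seq[i]];
    s.seq[s.len] = '\0';

    free(s.a);
    return s.seq;
}
```

For a quick sanity check, de_bruijn("01", 3) should give the 8-character string 00010111, in which every 3-bit pattern appears exactly once if you treat the string as cyclic.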

Defining PERL_NO_GET_CONTEXT doesn't stop it from running, but it doesn't improve performance one iota.

