Re^4: Using $a and $b from XS

Replies are listed 'Best First'.
Re^5: Using $a and $b from XS by tall_man (Parson) on Feb 07, 2005 at 01:28 UTC
Maybe some time could be saved by looking up the symbols once and then filling in the aliases each time through the loop. That's what they do in the sort driver code. (I'm not an XS expert either. I'm still hoping an expert will read this thread and give us some other ideas.)	[reply]
Re^6: Using $a and $b from XS by BrowserUk (Patriarch) on Feb 07, 2005 at 02:22 UTC
Maybe some time could be saved by looking up the symbols once and then filling in the aliases each time through the loop That could definitely be done. By looking them up once at the top of the main routine and then reverting to your original scheme of passing them into the comparator sub each time (almost as constants) in addition to values to set into them would work, but just pushing and popping them as constants each comparison is going to have a hit. So then you look at assigning the values into them inside the main routine, before passing them through. That's good, but it requires cut&paste reuse, complicates the calling of the comparator and makes the code messy. That could probably be alleviated by the use of a couple of macros to assign the values to the globals inline to the call to the comparator, but when I tried it, the compiler generated streams of errors. It necessitates calling macros within macros, including using some of those macros as lvalues of values returned by other macros. And worse, I don't really understand how some of those macros--which are themselves defined in terms of yet more macros--work! (Does anyone?) Therefore, I am not able to work out why what I was trying to do went bellyup. Normally, if I get "macro expansion" problems, I just ask the compiler to output the preprocessor output so I can see what the actual effect is, but just trying to track down which damn source file is being passed to the compiler is the first problem. Trying to call it with the appropriate environment (compiler flags--libraries etc) outside of the MakeMaker nest-of-vipers is impossible. So then I tried passing the /P flag via the Inline C options, which works, but as every source file includes every damn header in the whole crumbling edifice, it produces a file of 3/4 of a megabyte and ONE HUNDRED AND THIRTEEN THOUSAND NINE HUNDRED AND FOURTY EIGHT LINES! And what lines. Here is what your nice elegant topNbs() function looks like by the time the C compiler comes to convert it to machine code: void XS_main_topNbs(register PerlInterpreter my_perl , CV cv); void XS_main_topNbs(register PerlInterpreter my_perl , CV cv) { register SV *sp = (Perl_Tstack_sp_ptr(((PerlInterpreter )pthrea +d_getspecific((Perl_Gthr_key_ptr(0)))))); register SV *mark = (Per +l_Tstack_base_ptr(((PerlInterpreter )pthread_getspecific((Perl_Gthr +_key_ptr(0)))))) + ((Perl_Tmarkstack_ptr_ptr(((PerlInterpreter )pt +hread_getspecific((Perl_Gthr_key_ptr(0))))))--); I32 ax = mark - (P +erl_Tstack_base_ptr(((PerlInterpreter )pthread_getspecific((Perl_Gt +hr_key_ptr(0)))))) + 1; I32 items = sp - mark; if (items != 3) Perl_croak(((PerlInterpreter )pthread_getspecific((Perl_Gthr_key +_ptr(0)))), "Usage: main::topNbs(n, cmp, data)"); sp -= items; { int n = (int)((((Perl_Tstack_base_ptr(((PerlInterpreter )pthr +ead_getspecific((Perl_Gthr_key_ptr(0))))))[ax + (0)])->sv_flags & 0x +00010000) ? ((XPVIV) ((Perl_Tstack_base_ptr(((PerlInterpreter )pth +read_getspecific((Perl_Gthr_key_ptr(0))))))[ax + (0)])->sv_any)->xiv +_iv : Perl_sv_2iv(((PerlInterpreter )pthread_getspecific((Perl_Gthr +_key_ptr(0)))), (Perl_Tstack_base_ptr(((PerlInterpreter )pthread_ge +tspecific((Perl_Gthr_key_ptr(0))))))[ax + (0)])); SV cmp = (Perl_Tstack_base_ptr(((PerlInterpreter )pthread_g +etspecific((Perl_Gthr_key_ptr(0))))))[ax + (1)]; AV data; #line 258 "topN_pl_c7f1.xs" I32* temp; #line 314 "topN_pl_c7f1.c" if ((((Perl_Tstack_base_ptr(((PerlInterpreter )pthread_getspecif +ic((Perl_Gthr_key_ptr(0))))))[ax + (2)])->sv_flags & 0x00080000) && +((((XRV) ((Perl_Tstack_base_ptr(((PerlInterpreter )pthread_getspec +ific((Perl_Gthr_key_ptr(0))))))[ax + (2)])->sv_any)->xrv_rv)->sv_fla +gs & 0xff)==SVt_PVAV) data = (AV)((XRV) ((Perl_Tstack_base_ptr(((PerlInterpreter +)pthread_getspecific((Perl_Gthr_key_ptr(0))))))[ax + (2)])->sv_any) +->xrv_rv; else Perl_croak(((PerlInterpreter )pthread_getspecific((Perl_Gthr +_key_ptr(0)))), "data is not an array reference"); #line 260 "topN_pl_c7f1.xs" temp = (Perl_Tmarkstack_ptr_ptr(((PerlInterpreter )pthread_getsp +ecific((Perl_Gthr_key_ptr(0))))))++; topNbs(n, cmp, data); if ((Perl_Tmarkstack_ptr_ptr(((PerlInterpreter )pthread_getspeci +fic((Perl_Gthr_key_ptr(0)))))) != temp) { (Perl_Tmarkstack_ptr_ptr(((PerlInterpreter )pthread_getspecifi +c((Perl_Gthr_key_ptr(0)))))) = temp; do { do { IV tmpXSoff = (0); (Perl_Tstack_sp_ptr(((PerlInterpre +ter )pthread_getspecific((Perl_Gthr_key_ptr(0)))))) = (Perl_Tstack +_base_ptr(((PerlInterpreter )pthread_getspecific((Perl_Gthr_key_ptr +(0)))))) + ax + (tmpXSoff - 1); return; } while (0); } while (0); } return; #line 330 "topN_pl_c7f1.c" (Perl_Tstack_sp_ptr(((PerlInterpreter )pthread_getspecific((Per +l_Gthr_key_ptr(0)))))) = sp; return; } } [download] Now whilst the compiler may be able to make sense of it in that form, my eyes refuse to retain focus long enough to read from one end of a single line to the other. I did try to reformat that to my usual coding standard, but--even standalone, away from the other 114, 000 lines--it caused my usually reliable reformatting macros to segfault my editor! And do you see those oh so helpful line numbers embedded in compiler directives? Apart from the fact that they relate to intermediate files--some of which appear to get deleted--not the original source code--they are also, to the best of my ability to correlate them, WRONG! So how does anyone debug this stuff? Symbolic debuggers (do they help)? Embedded break points? Reverse binary osmosis? Pressing their ear to the cpu and listening to the frequency of the hum? I'm still hoping an expert will read this thread and give us some other ideas. Hmmm. Maybe... Examine what is said, not who speaks. Silence betokens consent. Love the truth but pardon error.	[reply] [d/l]
Re^6: Using $a and $b from XS by BrowserUk (Patriarch) on Feb 07, 2005 at 11:38 UTC
Here is my final incarnation of the comparator caller "function" plus a version of topNbs() that uses it. By re-writing the PUSHMARK() macro, I managed to squeeze the calling of the via-$A$B comparator into another macro and avoid (the little) overhead that imposed. I've also done away with the separate `SV** topN;` and associated memory allocation and freeing by building the return list directly on the stack. If your going to adapt your topNheap(), note the order and positioning of the large group of XS macros--they are critical. #define MYPUSHMARK(p) \ ( PL_markstack_ptr = ( ++PL_markstack_ptr == PL_markstack_max ) \ ? ( markstack_grow(), (p) - PL_stack_base ) \ : ( (p) - PL_stack_base ) ) #define CMP( callback, aa, bb ) ( \ ( GvSV( a ) = ( aa ) ), \ ( GvSV( b ) = ( bb ) ), \ ( MYPUSHMARK(SP) ), \ ( call_sv( callback, G_SCALAR\|G_NOARGS ) ), \ ( SPAGAIN ), \ ( POPi ) \ ) void topNbsAB( int n, SV cmp, AVdata ) { int i, k; int left, right; int len = av_len( data ); GV a = gv_fetchpv( "main::a", TRUE, SVt_PV ); GV* b = gv_fetchpv( "main::b", TRUE, SVt_PV ); dSP; dMARK; dAX; POPMARK; POPs; POPs; POPs; EXTEND( SP, n ); PUSHMARK( SP ); for( i = 0; i < n; i++ ) { PUSHs( newSViv( 0 ) ); } PUTBACK; for( i = 0; i <= len; i++ ) { SV val = av_fetch( data, i, 0 ); if( CMP( cmp, val, ST( n-1 ) ) < 0 ) continue; left = 0; right = n; while (left < right) { int middle = ( left + right ) >> 1; if( CMP( cmp, val, ST( middle ) ) <= 0 ) { left = middle + 1; } else { right = middle; } } for( k = n; k >= left; k-- ) ST( k ) = ST( k - 1 ); ST( left ) = val; } XSRETURN( n ); } [download] The upshot in my tests is that whilst using $a and $b is quicker than passing via the stack for small numbers of callbacks, as that number grows, the extra cost of lookup of globals (in the comparator sub itself) overtakes the saving of stacking the parameters in the C code. Once again, when you move above about 1% N of T, sort starts to win by not having to call the comparator at all. I'd be really interested to see what a real XS programmer would make of the above? Examine what is said, not who speaks. Silence betokens consent. Love the truth but pardon error.	[reply] [d/l] [select]
Re^7: Using $a and $b from XS by tall_man (Parson) on Feb 07, 2005 at 20:44 UTC
This last example does not work for me. For example, when I try: `my $max = 10; my @values = 1 .. $max; my @values_mixed = shuffle(@values); my @top = topNbsAB(5, sub { $a <=> $b }, \@values_mixed); print join(" ",sort { $a <=> $b } @top),"\n";` [download] I get a result like: `1 1 1 1 9` [download] I'm also working out how to avoid the extra creation of zero-fill items. Ideally, I would replace them with the first n items of the input in reverse-sorted order. There are two reasons for this: 1) We cannot yet handle negative numbers. 2) We should be testing for the case that the input length is less than or equal to n, and in that case returning everything they gave us. Something like: `// Special case -- too few values provided. Just return them. if (len+1 <= n) { for( i = 0; i < len; i++ ) { SV val = av_fetch( data, i, 0 ); PUSHs( newSVsv( val ) ); } PUTBACK; XSRETURN(len+1); }` [download]	[reply] [d/l] [select]
Re^8: Using $a and $b from XS by BrowserUk (Patriarch) on Feb 08, 2005 at 01:52 UTC
Re^9: Using $a and $b from XS by tall_man (Parson) on Feb 08, 2005 at 03:10 UTC
Some notes below your chosen depth have not been shown here
Re^6: Using $a and $b from XS by BrowserUk (Patriarch) on Feb 07, 2005 at 05:34 UTC
Gah! I guess I'm hooked :\| Pushing things along a little further, I succeeded in coding a macro (that doesn't break), to take care of the assignments to $a & $b and then call the comparator, reducing some of the overhead in the process: Read more... (2 kB) This appears to be reliable and gives it another step up over sort for smallish lists from biggish ones: P:\test>topn -MAX=10000 -N=10 1 trial of sort [10 of 10000] ( 17.186ms total), 17.186/trial 1 trial of topNbsAB [10 of 10000] ( 10.843ms total), 10.843/trial 1 trial of topNbs@ [10 of 10000] ( 17.276ms total), 17.276/trial P:\test>topn -MAX=10000 -N=100 1 trial of sort [100 of 10000] ( 16.568ms total), 16.568/trial 1 trial of topNbsAB [100 of 10000] ( 15.485ms total), 15.485/trial 1 trial of topNbs@ [100 of 10000] ( 22.263ms total), 22.263/trial P:\test>topn -MAX=100000 -N=100 1 trial of sort [100 of 100000] (231.679ms total), 231.67/trial 1 trial of topNbsAB [100 of 100000] (143.079ms total), 143.07/trial 1 trial of topNbs@ [100 of 100000] (187.500ms total), 187.50/trial [download] However, as soon as size of N approaches 10% of the main list, sort once again takes a lead: `P:\test>topn -MAX=100000 -N=1000 [SNIP] 1 trial of sort [1000 of 100000] (231.573ms total), 231.57/tria +l 1 trial of topNbsAB [1000 of 100000] (298.991ms total), 298.99/tria +l 1 trial of topNbs@ [1000 of 100000] (296.875ms total), 296.87/tria +l` [download] And if the list gets really large and the N very small (which ought to favour the linear search + binary sort over full sort and slice), sort once again wins hands down. More interestingly, topNbs() (via @_) starts to show better again? `1 trial of sort [10000 of 1000000] ( 3.156s total), 3.156s/tr +ial 1 trial of topNbsAB [10000 of 1000000] ( 13.641s total), 13.641/tr +ial 1 trial of topNbs@ [10000 of 1000000] ( 4.734s total), 4.734s/tr +ial` [download] I think this is due to the allocation, initialisation of the SV* topN array, followed by the incremental extension of the perl stack through XPUSHs as we copy the topN array onto the stack. Examine what is said, not who speaks. Silence betokens consent. Love the truth but pardon error.	[reply] [d/l] [select]