Re^5: Bit vector fiddling with Inline C

I don't suppose your knowledge of the internals can confirm that the example in the OP is indeed just passing a pointer to $vector, and NOT copying the entire byte sequence somewhere else at the same time (even just as a side-effect)?

Yes. I can confirm that. No copying is done.

When you define an XS argument as SV* sv_vec, you are asking for a pointer to the SV. When you operate via that pointer, you are changing the original SV.

As with ordinary Perl subs, the subroutine receives aliases to the actual variables passed:

sub x{ 
    ++$_  for @_ 
};;

( $a, $b, $c) = 12345..12347;;
x( $a, $b, $c );;

print $a, $b, $c;;
12346 12347 12348

No copying occurs unless the programmer assigns them to local vars:

sub x{ 
    my( $a, $b, $c ) = @_;
    ++$_  for $a, $b, $c;
}

If I am defining perl subs to operate upon large scalars, and they are more complex than a couple of lines--at which point the $_[0], $_[1] nomenclature can become awkward--then I will use scalar refs:

sub xyz (\$) {
    my $rStr = shift;
    substr $$rStr, ...;
    vec $$rStr, ...;
    ...
}

Which achieves the benefit of named variables without the cost of copying.

BTW: You still haven't mentioned what the "complex processing" you are performing in XS is?

I ask because my instinctive reaction is that if you are performing boolean operations on whole pairs (or more) of large bit vectors, it is almost certainly quicker to do it in Perl.

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.

"Science is about questioning the status quo. Questioning authority".

In the absence of evidence, opinion is indistinguishable from prejudice.

Re^6: Bit vector fiddling with Inline C by oxone (Friar) on May 09, 2011 at 21:54 UTC
Thanks! Re this point:

>>BTW: You still haven't mentioned what the "complex processing" you are performing in XS is?<<

Sorry, not trying to be mysterious, was just trying to keep the question focused!

One real example is this: there are 2 bit vectors of different sizes (each in the range of 1-8m bits) with irregular, many-to-many relationships between the bits in each vector. (The relationships are represented separately by a long array of pairs of ints.) One real function is to find every set bit in vector A, then set the corresponding bit(s) in vector B.

So: simple bitwise ops are out of the question because the vectors are of different sizes. I coded it first in pure Perl using the vec() function, but it was pretty slow (one call to vec() to test/set each bit). So, I switched to Inline::C for the heavy lifting and it's now over 20x faster (i.e. the entire test/set loop is inside one C function).
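For readers following along, the test/set loop described above might look roughly like the sketch below. The poster's actual code is not shown in this thread, so everything here is an assumption: the names bit_pair, bit_test, bit_set and map_bits are hypothetical, the vectors are treated as raw byte buffers (as they would be once the string buffer has been obtained from the SV), and bits are numbered least-significant first within each byte, which is the layout Perl's vec() uses for 1-bit elements.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pair type: bit `from` in vector A maps to bit `to` in vector B. */
typedef struct {
    uint32_t from;
    uint32_t to;
} bit_pair;

/* Return 1 if bit i is set. Bits are numbered low-to-high within each byte. */
static int bit_test(const unsigned char *v, size_t i) {
    return (v[i >> 3] >> (i & 7)) & 1;
}

/* Set bit i in place; no copy of the buffer is made. */
static void bit_set(unsigned char *v, size_t i) {
    v[i >> 3] |= (unsigned char)(1u << (i & 7));
}

/* For every pair whose `from` bit is set in A, set the `to` bit in B.
 * A bit in A may appear in several pairs (one-to-many), and several
 * pairs may target the same bit in B (many-to-one). */
static void map_bits(const unsigned char *a, size_t a_bits,
                     unsigned char *b, size_t b_bits,
                     const bit_pair *pairs, size_t n_pairs) {
    for (size_t k = 0; k < n_pairs; k++) {
        /* Guard against pairs that point outside either vector. */
        assert(pairs[k].from < a_bits && pairs[k].to < b_bits);
        if (bit_test(a, pairs[k].from))
            bit_set(b, pairs[k].to);
    }
}
```

The whole loop runs inside one C call, which is the property the post credits for the speedup over one vec() call per bit.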
Re^7: Bit vector fiddling with Inline C by syphilis (Archbishop) on May 10, 2011 at 02:12 UTC
"One real function is to find every set bit in vector A, then set the corresponding bit(s) in vector B"

I deduce, therefore, that A is smaller than (or equal to) B. I would expect you would observe a further significant increase in speed if you used the gmp library for that function, either accessing that library via Inline::C or using one of the existing perl extensions (Math::GMP, Math::GMPz).

Using the gmp library in C, the code to perform the above task would be something like:

size = mpz_sizeinbase(A, 2);
for(i = 0; i < size; i++) {
    if(mpz_tstbit(A, i)) mpz_setbit(B, i);
}

(and the same approach when using the aforementioned perl modules.)

Of course, you might consider the involvement of the gmp library to be cheating, depending upon the extent to which you want to handle things yourself :-)

Cheers,
Rob
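A note on the gmp loop above: because it copies each set bit of A to the same index in B, it amounts to a bitwise OR of A into B (gmp also provides mpz_ior for that). For that same-index case only, and not for the many-to-many mapping described in Re^6, a plain-C sketch without gmp could work a byte at a time. The name or_into is hypothetical, and the sketch assumes B's buffer is at least as long as A's.

```c
#include <stddef.h>

/* OR the first a_len bytes of A into B, in place.
 * Assumes b points to at least a_len bytes; bytes of B beyond a_len
 * are left untouched. */
static void or_into(unsigned char *b, const unsigned char *a, size_t a_len) {
    for (size_t i = 0; i < a_len; i++)
        b[i] |= a[i];
}
```

This handles eight bits per iteration instead of one, so it avoids a per-bit test and function call altogether.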
Re^8: Bit vector fiddling with Inline C by BrowserUk (Patriarch) on May 10, 2011 at 07:06 UTC
Two questions:

Why test the bit and then set it? If you are only going to set the bit if it is unset, then just setting it achieves the same thing.

Why would calling a function to set a bit be quicker than setting the bit directly? I guess the compiler might inline the function and optimise away the overheads, but still I don't see why it would end up being any quicker. As quick, possibly.
Re^9: Bit vector fiddling with Inline C by syphilis (Archbishop) on May 10, 2011 at 23:45 UTC
Re^10: Bit vector fiddling with Inline C by BrowserUk (Patriarch) on May 11, 2011 at 01:01 UTC
Re^9: Bit vector fiddling with Inline C by oxone (Friar) on May 10, 2011 at 08:11 UTC
Re^10: Bit vector fiddling with Inline C by BrowserUk (Patriarch) on May 10, 2011 at 14:34 UTC
Re^8: Bit vector fiddling with Inline C by oxone (Friar) on May 10, 2011 at 05:56 UTC
Great tip, thanks. Wasn't aware of that, but will certainly check it out. Currently getting/setting bits byte-wise as in the OP example, which I doubt is the best way. Will definitely test this out as an alternative.