in reply to Re^4: Bit vector fiddling with Inline C
in thread Bit vector fiddling with Inline C

I don't suppose your knowledge of the internals can confirm that the example in the OP is indeed just passing a pointer to $vector, and NOT copying the entire byte sequence somewhere else at the same time (even just as a side-effect)?

Yes. I can confirm that. No copying is done.

When you define an XS argument as SV* sv_vec, you are asking for a pointer to the SV. When you operate via that pointer, you are changing the original SV.
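As a rough analogy in plain C (this is not XS code, and `set_bit_in_place` is a hypothetical name used only for illustration), a function that receives a pointer writes straight into the caller's memory. Only the address crosses the call boundary:

```c
#include <assert.h>
#include <stddef.h>

/* Set bit number `bit` in the caller's buffer, in place.
   Only the buffer's address is passed in; the bytes are never copied.
   Bit order follows Perl's vec() with a width of 1: bit 0 is the
   low-order bit of the first byte. */
void set_bit_in_place(unsigned char *buf, size_t nbytes, size_t bit)
{
    assert(buf != NULL);
    assert((bit >> 3) < nbytes);   /* stay inside the buffer */
    buf[bit >> 3] |= (unsigned char)(1u << (bit & 7));
}
```

After `set_bit_in_place(v, 2, 9)` on a two-byte buffer `v`, the caller's own `v[1]` has changed; there is no second copy of the buffer anywhere.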

As with ordinary Perl subs, the subroutine receives aliases to the actual variables passed:

sub x { ++$_ for @_ }
( $a, $b, $c ) = 12345 .. 12347;
x( $a, $b, $c );
print "$a $b $c\n";    # prints: 12346 12347 12348

No copying occurs unless the programmer assigns them to local vars:

sub x{ my( $a, $b, $c ) = @_; ++$_ for $a, $b, $c; }

If I am defining perl subs to operate upon large scalars, and they are more complex than a couple of lines--at which point the $_[0], $_[1] nomenclature can become awkward--then I will use scalar refs:

sub xyz (\$) { my $rStr = shift; substr $$rStr, ...; vec $$rStr, ...; ... }

Which achieves the benefit of named variables without the cost of copying.

BTW: You still haven't mentioned what the "complex processing" you are performing in XS is?

I ask because my instinctive reaction is that if you are performing boolean operations on whole pairs (or more) of large bit vectors, it is almost certainly quicker to do it in Perl.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Re^6: Bit vector fiddling with Inline C
by oxone (Friar) on May 09, 2011 at 21:54 UTC

    Thanks! Re this point >>BTW: You still haven't mentioned what the "complex processing" you are performing in XS is?<< -- sorry, not trying to be mysterious, was just trying to keep the question focused!

    One real example is this: there are 2 bit vectors of different sizes (each in the range of 1-8m bits) with irregular, many-to-many relationships between the bits in each vector. (The relationships are represented separately by a long array of pairs of ints.) One real function is to find every set bit in vector A, then set the corresponding bit(s) in vector B.

    So: simple bitwise ops are out of the question because the vectors are of different sizes. I coded it first in pure Perl using the vec() function, but it was pretty slow (one call to vec() to test/set each bit). So I switched to Inline::C for the heavy lifting and it's now over 20x faster (i.e. the entire test/set loop is inside one C function).
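The test/set loop described above might look something like the following. This is a minimal sketch only, not the poster's actual code. The function name `map_set_bits`, the flat layout of the pairs array (`from, to, from, to, ...`) and the bounds checks are all assumptions. Bit order is assumed to match Perl's vec() with a width of 1.

```c
#include <assert.h>
#include <stddef.h>

/* Bit numbering as in Perl's vec($v, $i, 1): low-order bit of each byte first. */
#define TEST_BIT(v, i) ((v)[(i) >> 3] & (1u << ((i) & 7)))
#define SET_BIT(v, i)  ((v)[(i) >> 3] |= (unsigned char)(1u << ((i) & 7)))

/* For each (from, to) pair: if bit `from` is set in A, set bit `to` in B.
   `pairs` holds 2 * n_pairs ints laid out as from0, to0, from1, to1, ...
   Pairs that point outside either vector are skipped.
   Returns the number of pairs whose source bit was set. */
size_t map_set_bits(const unsigned char *a, size_t a_bits,
                    unsigned char *b, size_t b_bits,
                    const int *pairs, size_t n_pairs)
{
    size_t applied = 0;
    assert(a != NULL && b != NULL && pairs != NULL);

    for (size_t k = 0; k < n_pairs; k++) {
        /* Negative ints become huge size_t values and fail the range check. */
        size_t from = (size_t)pairs[2 * k];
        size_t to   = (size_t)pairs[2 * k + 1];

        if (from >= a_bits || to >= b_bits)
            continue;
        if (TEST_BIT(a, from)) {
            SET_BIT(b, to);
            applied++;
        }
    }
    return applied;
}
```

Because a source bit may appear in several pairs, one set bit in A can set several bits in B, which is the many-to-many relationship described. Both vectors are reached through pointers, so nothing is copied.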

      One real function is to find every set bit in vector A, then set the corresponding bit(s) in vector B

      I deduce, therefore, that A is smaller than (or equal to) B.

      I would expect you to see a further significant increase in speed if you used the gmp library for that function, either by accessing the library via Inline::C or by using one of the existing Perl extensions (Math::GMP, Math::GMPz). Using the gmp library in C, the code to perform the above task would be something like:
      size = mpz_sizeinbase(A, 2);
      for (i = 0; i < size; i++) {
          if (mpz_tstbit(A, i)) mpz_setbit(B, i);
      }
      (and the same approach when using the aforementioned perl modules.)

      Of course, you might consider the involvement of the gmp library to be cheating - depending upon the extent to which you want to handle things yourself :-)

      Cheers,
      Rob

        Two questions:

        1. Why test the bit and then set it?

          If you are only going to set the bit if it is unset, then just setting it achieves the same thing.

        2. Why would calling a function to set a bit be quicker than setting the bit directly?

          I guess the compiler might inline the function and optimise away the overheads, but I still don't see why it would end up being any quicker. As quick, possibly.
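Both questions can be illustrated with a plain C sketch (the name `or_into` is hypothetical, and no timings are claimed). When bit i of A maps to bit i of B, as in the quoted gmp loop, test-then-set is the same as OR-ing A into B. Done directly on the bytes, that handles eight bits per operation with no per-bit function call:

```c
#include <assert.h>
#include <stddef.h>

/* OR the bytes of A into B, in place, over the length the two share.
   A bit in B ends up set if it was already set or if the same-numbered
   bit in A is set; no per-bit test is needed. */
void or_into(unsigned char *b, size_t b_bytes,
             const unsigned char *a, size_t a_bytes)
{
    assert(a != NULL && b != NULL);
    size_t n = a_bytes < b_bytes ? a_bytes : b_bytes;

    for (size_t i = 0; i < n; i++)
        b[i] |= a[i];
}
```

This only covers the same-index case. It does not apply to a many-to-many mapping between vectors of different sizes, where each bit has to be looked up individually.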


        Great tip, thanks. Wasn't aware of that, but will certainly check it out. Currently getting/setting bits byte-wise as in the OP example, which I doubt is the best way. Will definitely test this out as an alternative.