in reply to Faster creation of Arrays in XS?

Other ideas to try.

Just push the pairs onto the stack and let the user assign that to a single level array:

    my @diffs = diff( ... );
    while( @diffs ) {
        ## pop takes values off the end, so each pair arrives as y, then x
        my( $y, $x ) = ( pop @diffs, pop @diffs );
        ## use em.
    }

Build a single level C array and return an SV whose PV slot points to the head of that array:

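    U32 len = sizeof( U32 ) * n;    /* n: some over estimate of the number of values */
    U32 *diffs;
    U32 i = 0;
    SV *packed = newSV( 0 );

    Newx( diffs, n, U32 );          /* Perl's allocator, since Perl will free the buffer */
    for( /* ... */ ) {
        for( /* ... */ ) {
            diffs[ i++ ] = x;
            diffs[ i++ ] = y;
        }
    }
    sv_upgrade( packed, SVt_PV );
    SvPV_set( packed, (char *)diffs );
    SvLEN_set( packed, len );
    SvCUR_set( packed, i * sizeof( U32 ) );   /* i values written, not i-1 */
    SvPOK_only( packed );
    return packed; // sv_2mortal?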

Then use unpack to access the numbers:

    my $diffs = diff( ... );
    while( length( $diffs ) ) {
        my( $x, $y ) = unpack 'VV', substr( $diffs, 0, 8, '' );
        ## use em.
    }
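
One caveat with the packed approach: 'V' in unpack reads little-endian 32-bit values, while a plain U32 array is in the machine's native byte order, so the two only agree on little-endian hardware. A minimal sketch, assuming you would rather write the bytes explicitly than switch the template to 'L' (native order); put_le32 and put_pair_le are hypothetical helper names, not part of the Perl API:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store v in buf[0..3], least significant byte first,
 * which is the layout unpack 'V' expects. */
void put_le32(unsigned char *buf, uint32_t v)
{
    buf[0] = (unsigned char)( v        & 0xFF);
    buf[1] = (unsigned char)((v >> 8)  & 0xFF);
    buf[2] = (unsigned char)((v >> 16) & 0xFF);
    buf[3] = (unsigned char)((v >> 24) & 0xFF);
}

/* Hypothetical helper: write one (x, y) pair at byte offset off
 * and return the offset just past it (off + 8). */
size_t put_pair_le(unsigned char *buf, size_t off, uint32_t x, uint32_t y)
{
    assert(buf != NULL);
    put_le32(buf + off, x);
    put_le32(buf + off + 4, y);
    return off + 8;
}
```

The final offset is then the value to hand to SvCUR_set. On a little-endian machine this produces the same bytes as the raw U32 array, so it only matters if the module has to run elsewhere.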

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
I'm with torvalds on this Agile (and TDD) debunked I told'em LLVM was the way to go. But did they listen!

Re^2: Faster creation of Arrays in XS?
by wollmers (Scribe) on Jun 22, 2015 at 09:36 UTC

    BrowserUK: Unpack in your above code also creates a list, which seems to be the same bottleneck. Benchmarks are in the same range.

    Also tried List::Util::pairs(), which benchmarks a little faster. Maybe I can copy and inline this part of C code.

    What I may do now is provide different formats: AoA [2][L], AoA [L][2], and 2 bitstrings (match-index). Bitstrings should be very fast, but are not so convenient to process. Perl5 does not have the functions lsb (index of the least significant bit) and msb, which Perl6 has.
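
    For reference, lsb and msb are short to write on the C side of an XS module. This is only a portable sketch; lsb64 and msb64 are made-up names, and the simple loops favour clarity over speed:

```c
#include <assert.h>
#include <stdint.h>

/* Index (0-based) of the lowest set bit. x must be non-zero. */
int lsb64(uint64_t x)
{
    int i = 0;
    assert(x != 0);
    while ((x & 1) == 0) {
        x >>= 1;
        i++;
    }
    return i;
}

/* Index (0-based) of the highest set bit. x must be non-zero. */
int msb64(uint64_t x)
{
    int i = 0;
    assert(x != 0);
    while (x >>= 1)
        i++;
    return i;
}
```

    GCC and Clang also offer the builtins __builtin_ctzll and __builtin_clzll, which usually compile to a single instruction, if portability to other compilers is not a concern.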

      Unpack in your above code also creates a list,

      Only a list of 2 elements? That's why I showed using substr to nibble the packed array in pairs.



        Yes, 2 elements 50 times.

        #!perl
        use 5.006;

        # file: pack.t

        use strict;
        use warnings;

        use Benchmark qw(:all) ;
        use Data::Dumper;
        use List::Util qw(pairs);

        my @diffs = map { $_,$_; } (0..49);
        #print '@diffs: ',Dumper(\@diffs),"\n";

        my $packed0 = pack('V*',@diffs);
        my $a = [];
        #print 'len: ',length( $packed0),"\n";

        my $packed = $packed0;
        while (length( $packed)) {
            push @$a,[unpack ('VV', substr( $packed, 0, 8, ''))];
        }
        #print Dumper($a);

        timethese( 50_000, {
            'unpack while' => sub {
                my $packed = $packed0;
                while (length( $packed)) {
                    my ($x,$y)= unpack ('VV', substr( $packed, 0, 8, ''));
                }
            },
            'unpack while single' => sub {
                my $packed = $packed0;
                while (length( $packed)) {
                    my $x = unpack ('V', substr( $packed, 0, 4, ''));
                }
            },
            'unpack while push' => sub {
                my $packed = $packed0;
                $a = [];
                while (length( $packed)) {
                    push @$a,[unpack ('VV', substr( $packed, 0, 8, ''))];
                }
            },
            'unpack for' => sub {
                for ( my $i = 0;$i < length( $packed0)-1; $i += 8 ) {
                    my ($x,$y)= unpack ('VV', substr( $packed0, $i, 8));
                }
            },
            'unpack for push' => sub {
                $a = [];
                for ( my $i = 0;$i < length( $packed0)-1; $i += 8 ) {
                    push @$a,[unpack ('VV', substr( $packed0, $i, 8))];
                }
            },
        });

        ########

        $ perl pack.t
        Benchmark: timing 50000 iterations of unpack for, unpack for push, unpack while, unpack while push, unpack while single...
                 unpack for: 1 wallclock secs ( 1.01 usr + 0.00 sys = 1.01 CPU) @ 49504.95/s (n=50000)
            unpack for push: 2 wallclock secs ( 1.84 usr + 0.00 sys = 1.84 CPU) @ 27173.91/s (n=50000)
               unpack while: 1 wallclock secs ( 0.84 usr + 0.00 sys = 0.84 CPU) @ 59523.81/s (n=50000)
          unpack while push: 2 wallclock secs ( 1.75 usr + 0.00 sys = 1.75 CPU) @ 28571.43/s (n=50000)
        unpack while single: 1 wallclock secs ( 1.28 usr + 0.00 sys = 1.28 CPU) @ 39062.50/s (n=50000)

        With bitmaps it would be an array of 2 scalars (2 x 64-bit IVs, 53 bits used in the original test case).
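
        To show what the consumer of such a result would have to do, here is a rough sketch of turning two bitmaps back into index pairs. It assumes one particular encoding that the thread does not spell out: bit k of the first word marks a matched position in the first sequence, likewise for the second word, and the n-th set bit of one pairs with the n-th set bit of the other. bitmaps_to_pairs is a made-up name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed encoding: the n-th set bit of a pairs with the n-th set bit of b.
 * Writes up to max_pairs (x, y) pairs into out (two values per pair)
 * and returns the number of pairs written. */
size_t bitmaps_to_pairs(uint64_t a, uint64_t b, uint32_t *out, size_t max_pairs)
{
    size_t n = 0;
    uint32_t x = 0, y = 0;   /* bit index currently at position 0 of a and b */

    assert(out != NULL);
    while (a != 0 && b != 0 && n < max_pairs) {
        while ((a & 1) == 0) { a >>= 1; x++; }   /* skip to next set bit in a */
        while ((b & 1) == 0) { b >>= 1; y++; }   /* skip to next set bit in b */
        out[2 * n]     = x;
        out[2 * n + 1] = y;
        n++;
        a >>= 1; x++;                            /* consume both bits */
        b >>= 1; y++;
    }
    return n;
}
```

        With a = 0b1011 and b = 0b11010 this yields the pairs (0,1), (1,3), (3,4). In pure Perl the same walk needs a loop per bit, which is where the missing lsb would be felt.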