I had a reference to an array the other day, that looked something like this:

my $r = [ 1, 2, 3, 4, 5, 6 ];

the problem being that I wanted to restructure the values, by taking them by pairs, hence:

my $r = [ [1, 2], [3, 4], [5, 6] ];

Looks easy enough, and certainly in Perl 6 it will be a snap, since you can iterate over lists taking n at a time (notwithstanding any glaring syntax errors due to my ignorance):

  for @$r -> $x, $y {
    push @$pairs, [$x, $y];
  }

But in Perl 5 things are not quite so simple. I'm pretty sure this topic has been covered before, but after searching for a while on obvious things (qw/array element processing time/), all I found was Processing Multiple Array Elements and Breaking output lines into N-element chunks but I'm sure there's another one out there somewhere. So I turned to the chatterbox, and with a little help from my friends™ I managed to get it figured out, and so I thought I'd preserve the solutions here for posterity.

My first idea (no giggling please) was to either push a new anonymous array onto the end of the array or push an element to the last array of the array, which gives:

my $r2;
my $flipper = 0;
for( @$r ) {
  if( $flipper++ % 2 ) {
    push @{$r2->[$#{$r2}]}, $_;
  }
  else {
    push @$r2, [ $_ ];
  }
}

It looks pretty silly, but it doesn't destroy the contents of $r, and it can scale beyond 2 elements at a time (by changing the modulus, e.g. $flipper++ % 3 for triples). At this point other people chimed in with different suggestions.
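To make the scaling claim concrete, here is a minimal sketch of the flipper idea with the group size pulled out into a parameter. The sub name group_flipper and the $n argument are made up for illustration; they were not part of the chatterbox discussion.

```perl
use strict;
use warnings;

# group_flipper is a hypothetical helper; $n is the group size.
sub group_flipper {
    my ($n, @list) = @_;
    my @groups;
    my $count = 0;
    for my $item (@list) {
        if ($count++ % $n) {
            # not on a group boundary: extend the most recent group
            push @{ $groups[-1] }, $item;
        }
        else {
            # on a group boundary: start a new group
            push @groups, [ $item ];
        }
    }
    return \@groups;
}

my $triples = group_flipper(3, 1 .. 7);
# $triples is [ [1, 2, 3], [4, 5, 6], [7] ]
```

Note that a list whose length is not a multiple of $n simply ends with a short group.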

domm suggested treating the array as a hash and iterating over keys and values:

my %h=(@$r);
for (keys %h) {
  push @$r2, [$_, $h{$_} ]
}
# or possibly
# push @$r2, [$_, $h{$_} ] for keys %h

But that has a number of problems:

it doesn't preserve the order of the array;
all array elements that become keys must be unique;
the keys are stringified;
it doesn't scale beyond 2 elements at a time.
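The uniqueness problem is easy to see with a small example. This is only a sketch with made-up data; the sort is there purely to make the result reproducible, since keys returns the keys in no particular order.

```perl
use strict;
use warnings;

# Made-up data with a repeated "key" in the odd positions.
my $r = [ 'a', 1, 'b', 2, 'a', 3 ];

# In the hash, the later 'a' => 3 silently overwrites 'a' => 1.
my %h = @$r;

# sort only makes the order predictable; it does not restore
# the original order of the array.
my $r2 = [ map { [ $_, $h{$_} ] } sort keys %h ];

# $r2 now holds two pairs, ['a', 3] and ['b', 2]; the pair ['a', 1] is gone.
```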

BrowserUk suggested the following map:

my $i = 0;
my $r2 = [ map {[ $r->[$i*2], $r->[$i++*2+1] ]}
           (0 .. scalar(@$r)/2 - 1) ];
    # or (0 .. $#{$r}/2)

It does the job, but there's too much happening for it to be clear what is going on, specifically the limit-1 and postincrement. Plus it's really hard to format the whitespace. This was later amended to:

$r2 = [ map{ [ $r->[$_*2], $r->[$_*2+1] ] } (0 .. $#{$r}/2) ]

which gets rid of the icky index variable. Then, for an encore, BrowserUk rearranged the code to use array slices instead:

$r2->[$_] = [ @$r[$_*2, $_*2+1] ] for (0..$#{$r} / 2);

sauoq suggested a good olde C-like approach:

for( my $i=0; $i < scalar @$r; $i += 2 ) {
  $r2->[$i/2] = [ @$r[$i, $i+1] ]
}

To a Perl beginner, it will be pretty obvious what's going on.

Someone else suggested (and I forget who (and that's maybe just as well)) doing the following:

my @copy = @$r;
for( @copy ) {
    push @$r2, [ $_, shift @copy ];
}

But using shift in a foreach is evil. Sometimes you can get away with it, but in this case the result is wrong. It's off-by-one in a bizarre way. And of course it destroys the source array.

But I think this prodded someone else into getting this approach done the right way, for they (again I forget who) suggested:

my @copy = @$r;
while( @copy ) {
    push @$r2, [ shift @copy, shift @copy ];
}

and in actual fact that is the code that I adopted (despite the fact that it is a destructive read), the main criterion being how easy is it for a non-Perl expert to understand. The code already slings references to anonymous structures all about, so this loop has a certain appealing clarity.

jmcnamara chimed in late with:

  $r2 = [ map { [$r->[2*$_], $r->[2*$_+1]] } 0 .. scalar @$r / 2 - 1 ]

which could, in light of the other solutions, be amended to

  $r2 = [ map { [@$r[2*$_, 2*$_+1]] } 0 .. scalar @$r / 2 - 1 ]

In one of the threads I found, merlyn suggests using splice, which would give something like

my @copy = @$r;
while (my @pair = splice @copy, 0, 2) {
  push @$r2, [@pair];
}

This particular variant offers the simplest and clearest way of extending to tuples of arbitrary size: just increase the value of splice's third parameter.
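As a rough sketch of that extension, the splice loop can be wrapped in a sub that takes the tuple size as an argument. The name chunk and the $n parameter are my own for illustration, not something from merlyn's post.

```perl
use strict;
use warnings;

# chunk is a hypothetical helper; $n is the tuple size passed
# to splice as its third argument (the length).
sub chunk {
    my ($n, @list) = @_;
    die "chunk size must be positive\n" unless $n > 0;
    my @chunks;
    # splice removes up to $n elements from the front of the copy;
    # the loop ends when nothing is left to remove.
    while (my @group = splice @list, 0, $n) {
        push @chunks, [@group];
    }
    return \@chunks;
}

my $r2 = chunk(3, 1 .. 8);
# $r2 is [ [1, 2, 3], [4, 5, 6], [7, 8] ]
```

Because @list is a copy of the arguments, the caller's array is left alone, and a trailing short group is kept rather than padded with undef.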

Well, all that's well and good, but what would a TIMTOWTDI meditation be without a Benchmark?

Array with 8 elements

Benchmark: timing 50000 iterations of aslice, c_loop, flipper,
 map_i, map_jm, map_jm2, map_pure, shifter, splicer...
    aslice: 11 wallclock secs ( 7.29 usr +  0.00 sys =  7.29 CPU) @ 6858.71/s (n=50000)
    c_loop: 10 wallclock secs ( 5.91 usr +  0.00 sys =  5.91 CPU) @ 8460.24/s (n=50000)
   flipper: 18 wallclock secs ( 9.79 usr +  0.00 sys =  9.79 CPU) @ 5107.25/s (n=50000)
     map_i:  9 wallclock secs ( 5.37 usr +  0.00 sys =  5.37 CPU) @ 9310.99/s (n=50000)
    map_jm:  9 wallclock secs ( 5.06 usr +  0.00 sys =  5.06 CPU) @ 9881.42/s (n=50000)
   map_jm2:  8 wallclock secs ( 5.00 usr +  0.00 sys =  5.00 CPU) @ 10000.00/s (n=50000)
  map_pure:  7 wallclock secs ( 5.30 usr +  0.00 sys =  5.30 CPU) @ 9433.96/s (n=50000)
   shifter:  6 wallclock secs ( 3.99 usr +  0.00 sys =  3.99 CPU) @ 12531.33/s (n=50000)
   splicer:  8 wallclock secs ( 5.91 usr +  0.00 sys =  5.91 CPU) @ 8460.24/s (n=50000)
            Rate flipper aslice c_loop splicer map_i map_pure map_jm map_jm2 shifter
flipper   5107/s      --   -26%   -40%    -40%  -45%     -46%   -48%    -49%    -59%
aslice    6859/s     34%     --   -19%    -19%  -26%     -27%   -31%    -31%    -45%
c_loop    8460/s     66%    23%     --     -0%   -9%     -10%   -14%    -15%    -32%
splicer   8460/s     66%    23%     0%      --   -9%     -10%   -14%    -15%    -32%
map_i     9311/s     82%    36%    10%     10%    --      -1%    -6%     -7%    -26%
map_pure  9434/s     85%    38%    12%     12%    1%       --    -5%     -6%    -25%
map_jm    9881/s     93%    44%    17%     17%    6%       5%     --     -1%    -21%
map_jm2  10000/s     96%    46%    18%     18%    7%       6%     1%      --    -20%
shifter  12531/s    145%    83%    48%     48%   35%      33%    27%     25%      --

Array with 1000 elements

Benchmark: timing 500 iterations of aslice, c_loop, flipper,
 map_i, map_jm, map_jm2, map_pure, shifter, splicer...
    aslice:  8 wallclock secs ( 6.26 usr +  0.00 sys =  6.26 CPU) @ 79.87/s (n=500)
    c_loop:  6 wallclock secs ( 4.89 usr +  0.00 sys =  4.89 CPU) @ 102.25/s (n=500)
   flipper: 14 wallclock secs ( 9.55 usr +  0.00 sys =  9.55 CPU) @ 52.36/s (n=500)
     map_i:  9 wallclock secs ( 4.83 usr +  0.00 sys =  4.83 CPU) @ 103.52/s (n=500)
    map_jm:  9 wallclock secs ( 4.57 usr +  0.00 sys =  4.57 CPU) @ 109.41/s (n=500)
   map_jm2:  8 wallclock secs ( 4.62 usr +  0.00 sys =  4.62 CPU) @ 108.23/s (n=500)
  map_pure:  6 wallclock secs ( 4.81 usr +  0.00 sys =  4.81 CPU) @ 103.95/s (n=500)
   shifter:  4 wallclock secs ( 3.53 usr +  0.00 sys =  3.53 CPU) @ 141.64/s (n=500)
   splicer:  8 wallclock secs ( 5.50 usr +  0.00 sys =  5.50 CPU) @ 90.91/s (n=500)
           Rate flipper aslice splicer c_loop map_i map_pure map_jm2 map_jm shifter
flipper  52.4/s      --   -34%    -42%   -49%  -49%     -50%    -52%   -52%    -63%
aslice   79.9/s     53%     --    -12%   -22%  -23%     -23%    -26%   -27%    -44%
splicer  90.9/s     74%    14%      --   -11%  -12%     -13%    -16%   -17%    -36%
c_loop    102/s     95%    28%     12%     --   -1%      -2%     -6%    -7%    -28%
map_i     104/s     98%    30%     14%     1%    --      -0%     -4%    -5%    -27%
map_pure  104/s     99%    30%     14%     2%    0%       --     -4%    -5%    -27%
map_jm2   108/s    107%    35%     19%     6%    5%       4%      --    -1%    -24%
map_jm    109/s    109%    37%     20%     7%    6%       5%      1%     --    -23%
shifter   142/s    171%    77%     56%    39%   37%      36%     31%    29%      --

The code

#! /usr/bin/perl -w

use strict;
use Benchmark qw/cmpthese/;
use Data::Dumper;

$Data::Dumper::Indent = 0;

my $iters = shift || 1000;
my $len   = shift || 8;

sub flipper {
  my @r2;
  my $flipper = 0;
  for( @_ ) {
    if( $flipper++ % 2 ) {
      push @{$r2[$#r2]}, $_;
    }
    else {
      push @r2, [ $_ ];
    }
  }
  \@r2;
}

sub map_i {
  my $i = 0;
  [ map { [$_[$i*2], $_[$i++*2+1]] } (0 .. scalar(@_)/2 - 1) ];
}

sub shifter {
  my @r2;
  while( @_ ) {
    push @r2, [ shift @_, shift @_ ];
  }
  \@r2;
}

sub c_loop {
  my @r2;
  for( my $i=0; $i < scalar @_; $i += 2 ) {
    push @r2, [ @_[$i, $i+1] ]
  }
  \@r2;
}

sub map_pure {
  [ map{ [$_[$_*2], $_[$_*2+1]] } (0 .. scalar @_ / 2 - 1) ];
}

sub aslice {
  my @r2;
  $r2[$_] = [ @_[$_*2, $_*2+1] ] for (0 .. scalar @_ / 2 - 1);
  \@r2;
}

sub map_jm {
  [ map { [$_[2*$_], $_[2*$_+1]] } 0 .. scalar @_ / 2 - 1 ];
}

sub map_jm2 {
  [ map { [@_[2*$_, 2*$_+1]] } 0 .. scalar @_ / 2 - 1 ];
}

sub splicer {
  my @r2;
  while (my @pair = splice @_, 0, 2) {
    push @r2, [@pair];
  }
  \@r2;
}

######################################################

my $r = [ 1 .. $len ]; 
for my $sub( qw/ flipper shifter splicer c_loop aslice
    map_i map_pure map_jm map_jm2 /) {
  no strict 'refs';
  # test
  # print "$sub ", Dumper( &$sub( @$r )), "\n";
}

cmpthese(
  $iters,
  {
    flipper  => sub { flipper( @$r )},
    shifter  => sub { shifter( @$r )},
    splicer  => sub { splicer( @$r )},
    c_loop   => sub { c_loop( @$r )},
    aslice   => sub { aslice( @$r )},
    map_i    => sub { map_i( @$r )},

    map_pure => sub { map_pure( @$r )},
    map_jm   => sub { map_jm( @$r )},
    map_jm2  => sub { map_jm2( @$r )},
  },
);
__END__

I find it interesting that the code I consider the easiest for a non-expert to read is also the fastest... And when you see how bad my initial idea was, you now understand why I went asking in the CB (I blame lack of coffee/lack of sleep/early morning).

print@_{sort keys %_},$/if%_=split//,'= & *a?b:e\f/h^h!j+n,o@o;r$s-t%t#u'

In reply to Processing arrays 2 elements at a time (TIMTOWTDI) by grinder
