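split has an optimisation that applies only under very specific circumstances: when its result is assigned directly to a package array. Compare the ops generated for a lexical array with those generated for a package array: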
$ perl -MO=Concise,-exec -e'my @a = split //, $buf;'
1 <0> enter
2 <;> nextstate(main 1 -e:1) v:{
3 <0> pushmark s
4 </> pushre(/""/) s/64
5 <#> gvsv[*buf] s
6 <$> const[IV 0] s
7 <@> split[t3] lK
8 <0> pushmark s
9 <0> padav[@a:1,2] lRM*/LVINTRO
a <2> aassign[t4] vKS
b <@> leave[1 ref] vKP/REFC
-e syntax OK
$ perl -MO=Concise,-exec -e'our @a = split //, $buf;'
1 <0> enter
2 <;> nextstate(main 1 -e:1) v:{
3 </> pushre(/""/ => @a) s/64
4 <#> gvsv[*buf] s
5 <$> const[IV 0] s
6 <@> split[t5] vK
7 <@> leave[1 ref] vKP/REFC
-e syntax OK
In the second program, the assignment is removed and split stores the result into the array itself. It's as if
our @a = split(//, $buf);
became
split_into(@a, //, $buf);
Unfortunately, it doesn't currently work for lexicals, even when the declaration is separated from the assignment:
$ perl -MO=Concise,-exec -e'my @a; @a = split //, $buf;'
1 <0> enter
2 <;> nextstate(main 1 -e:1) v:{
3 <0> padav[@a:1,2] vM/LVINTRO
4 <;> nextstate(main 2 -e:1) v:{
5 <0> pushmark s
6 </> pushre(/""/) s/64
7 <#> gvsv[*buf] s
8 <$> const[IV 0] s
9 <@> split[t3] lK
a <0> pushmark s
b <0> padav[@a:1,2] lRM*
c <2> aassign[t4] vKS
d <@> leave[1 ref] vKP/REFC
-e syntax OK
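Whether the optimisation kicks in depends on the perl you're running, so it may be worth checking rather than assuming. The following is only a sketch: it re-runs the same B::Concise command shown above and looks for an aassign op in the output. The helper name has_aassign is made up for this example, and the list form of a piped open is assumed to be available on your platform.

```perl
use strict;
use warnings;

# Hypothetical helper: returns true if the compiled snippet still contains
# an aassign op, i.e. the result of split is copied by a separate assignment.
sub has_aassign {
    my ($code) = @_;

    # $^X is the perl currently running this script. Only STDOUT (the op
    # listing) is captured; the "-e syntax OK" line goes to STDERR.
    open(my $fh, '-|', $^X, '-MO=Concise,-exec', '-e', $code)
        or die "Can't run $^X: $!";
    my @ops = <$fh>;
    close($fh);

    return scalar grep { /\baassign\b/ } @ops;
}

for my $code (
    'our @a = split //, $buf;',
    'my @a; @a = split //, $buf;',
) {
    printf "%-30s %s\n", $code,
        has_aassign($code) ? 'separate aassign' : 'no aassign (split stores directly)';
}
```

The result for the lexical case can differ between perl versions, which is the reason to check on the perl you actually deploy.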
Taking advantage of that optimisation gives a solution (split_pa below) that's almost twice as fast as the previous solution (split), 85% faster in this run:
           Rate    regex unpack_C unpack_a    split split_pa
regex    10.4/s       --      -2%     -44%     -50%     -73%
unpack_C 10.6/s       2%       --     -43%     -49%     -73%
unpack_a 18.8/s      79%      76%       --     -10%     -52%
split    20.9/s     100%      97%      12%       --     -46%
split_pa 38.8/s     271%     264%     107%      85%       --
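use strict;
use warnings;

use Benchmark qw( cmpthese );

# Each candidate splits $buf into a list of single characters.
my %tests = (
    split    => q{ my @a = split //, $buf; },
    split_pa => q{ local our @a; @a = split //, $buf; },  # package array: split stores into @a directly
    regex    => q{ my @a = $buf =~ /./sg; },
    unpack_C => q{ my @a = map chr, unpack 'C*', $buf; },
    unpack_a => q{ my @a = unpack '(a)*', $buf; },
);

# Wrap each snippet so it compiles under strict and can see the package variable $buf.
$_ = "use strict; use warnings; our \$buf; $_ 1"
    for values(%tests);

local our $buf = "abcdef\x00ghik" x 10_000;

# A negative count means "run each test for at least that many CPU seconds".
cmpthese(-2, \%tests);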
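A benchmark like this is only meaningful if every candidate produces the same list. As a quick sanity check (a sketch, not part of the original benchmark), the snippets can be run once on a short copy of the test string and compared. The %result hash and the comparison by character codes are choices made for this example.

```perl
use strict;
use warnings;

my $buf = "abcdef\x00ghik";

# Run each benchmarked approach once and keep the resulting list.
my %result = (
    split    => [ split //, $buf ],
    regex    => [ $buf =~ /./sg ],
    unpack_C => [ map chr, unpack 'C*', $buf ],
    unpack_a => [ unpack '(a)*', $buf ],
);
{
    # Same pattern as split_pa: assign to a localized package array.
    local our @a;
    @a = split //, $buf;
    $result{split_pa} = [ @a ];
}

# Compare by character codes so the embedded NUL is handled unambiguously.
my $expected = join ',', map ord, @{ $result{split} };
for my $name (sort keys %result) {
    my $got = join ',', map ord, @{ $result{$name} };
    print "$name: ", ($got eq $expected ? "same" : "DIFFERENT"), "\n";
}
```

Note that this only checks byte strings like the one used in the benchmark; the unpack 'C*' variant is not expected to behave the same way on strings containing characters above 255.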