Re^2: Performance problems on splitting long strings

Borrowing heavily from Kenosis' code (thanks), regex seems to be faster than unpack (at least using substitution):

             Rate _substr _unpack  _regex  _split
_substr 2187335/s      --    -11%    -16%    -20%
_unpack 2457294/s     12%      --     -6%    -10%
_regex  2612321/s     19%      6%      --     -4%
_split  2726283/s     25%     11%      4%      --

Perl code:

use strict;
use warnings;
use Benchmark qw/cmpthese/;

my $string = join '', 'A' .. 'Y';

sub _unpack {
    my @arr = unpack '(A5)*', $string;
}

sub _regex {
    my @arr;
    while (length $string){ $string =~ s/^(.{5})//; push @arr, $1; }
}

sub _split {
    my @arr = split /.{5}\K/, $string;
}

sub _substr {
    my @arr;

    for ( my $i = 0 ; $i < length $string ; $i += 5 ) {
        push @arr, substr $string, $i, 5;
    }
}

cmpthese(
    -5,
    {
        _unpack => sub { _unpack() },
        _split  => sub { _split() },
        _substr => sub { _substr() },
        _regex  => sub { _regex() }
    }
);

Replies are listed 'Best First'.

Re^3: Performance problems on splitting long strings
by BrowserUk (Patriarch) on Jan 31, 2014 at 16:30 UTC

Your benchmark is totally broken.

When your _regex() function runs the first time, it completely destroys $string; every time after that, the regex is operating on an empty string and thus runs very quickly. Ditto every other test that runs after the first run of _regex().
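One way to avoid the problem described above is to have each routine work on its own copy of the input, so the destructive substitution never touches the shared string. The following is only a minimal sketch, not code from either poster: the chunks_* names are made up for illustration, and the sanity check before the timing run is an addition. Copying the argument adds a small cost to every routine, so the absolute rates will not be comparable with the table in the parent post.

```perl
use strict;
use warnings;
use Benchmark qw/cmpthese/;

# Same 25-character input as in the parent post.
my $string = join '', 'A' .. 'Y';

# Each routine receives the string as an argument and works on a
# private copy, so no routine can alter the input seen by the others.
sub chunks_unpack {
    my ($s) = @_;
    return unpack '(A5)*', $s;
}

sub chunks_regex {
    my ($s) = @_;    # the substitution below consumes only this copy
    my @arr;
    push @arr, $1 while $s =~ s/^(.{1,5})//s;
    return @arr;
}

sub chunks_split {
    my ($s) = @_;
    return split /.{5}\K/s, $s;    # \K needs Perl 5.10 or later
}

sub chunks_substr {
    my ($s) = @_;
    my @arr;
    for ( my $i = 0 ; $i < length $s ; $i += 5 ) {
        push @arr, substr $s, $i, 5;
    }
    return @arr;
}

# Sanity check before timing: all four routines must return the same
# chunks, and the shared input must be left intact.
my $expected = join ',', chunks_substr($string);
for my $impl ( \&chunks_unpack, \&chunks_regex, \&chunks_split ) {
    my $got = join ',', $impl->($string);
    die "implementations disagree: $got vs $expected" unless $got eq $expected;
}
die 'input string was modified' unless $string eq join( '', 'A' .. 'Y' );

cmpthese(
    -1,
    {
        _unpack => sub { my @arr = chunks_unpack($string) },
        _regex  => sub { my @arr = chunks_regex($string) },
        _split  => sub { my @arr = chunks_split($string) },
        _substr => sub { my @arr = chunks_substr($string) },
    }
);
```

With this arrangement every iteration of every test sees the full 25-character string, so the comparison should at least measure the same amount of work for each approach.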

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.

"Science is about questioning the status quo. Questioning authority".

In the absence of evidence, opinion is indistinguishable from prejudice.
