Re: Performance problems on splitting long strings

Just fyi:

use strict;
use warnings;
use Tie::CharArray;
use Benchmark qw/cmpthese/;

my $string = join '', 'A' .. 'Y';

sub _unpack {
    my @arr = unpack '(A5)*', $string;
}

sub _regex {
    my @arr = $string =~ /.{5}/g;
}

sub _split {
    my @arr = split /.{5}\K/, $string;
}

sub _substr {
    my @arr;

    for ( my $i = 0 ; $i < length $string ; $i += 5 ) {
        push @arr, substr $string, $i, 5;
    }
}

sub _open {
    my @arr;

    open my $sh, '<', \$string;
    while ( read $sh, my $chars, 5 ) {
        push @arr, $chars;
    }
}

cmpthese(
    -5,
    {
        _unpack => sub { _unpack() },
        _regex  => sub { _regex() },
        _split  => sub { _split() },
        _substr => sub { _substr() },
        _open   => sub { _open() }
    }
);

Output:

            Rate   _open  _regex _substr  _split _unpack
_open   265986/s      --    -53%    -55%    -57%    -70%
_regex  563780/s    112%      --     -5%     -8%    -36%
_substr 593788/s    123%      5%      --     -3%    -33%
_split  612001/s    130%      9%      3%      --    -31%
_unpack 881949/s    232%     56%     49%     44%      --
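Before comparing rates, it is worth confirming that the approaches actually return the same chunks. The following is only a minimal sketch, not part of the original post: it reuses the same 25-character test string, and the `%chunks` hash and the `map`-based `substr` variant are my own illustrative choices. It assumes Perl 5.10 or later, since `\K` is needed for the `split` version.

```perl
use strict;
use warnings;

# Same test string as the benchmark: 'A' .. 'Y', 25 characters.
my $string = join '', 'A' .. 'Y';

# One entry per chunking technique (hash name is illustrative only).
my %chunks = (
    unpack => [ unpack '(A5)*', $string ],
    regex  => [ $string =~ /.{5}/g ],
    split  => [ split /.{5}\K/, $string ],
    # Equivalent to the C-style substr loop, written with map for brevity.
    substr => [ map { substr $string, $_ * 5, 5 } 0 .. length($string) / 5 - 1 ],
);

# Print each result so the lists can be compared by eye.
for my $name ( sort keys %chunks ) {
    print "$name: @{ $chunks{$name} }\n";
}
```

Note that this only checks a string whose length is a multiple of 5 and which has no trailing whitespace or newlines. On other input the techniques can differ: for example, the `A` template of `unpack` strips trailing spaces, and `/.{5}/g` drops a final partial chunk.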


Re^2: Performance problems on splitting long strings by SimonPratt (Friar) on Jan 31, 2014 at 15:39 UTC
Borrowing heavily from Kenosis' code (thanks), regex seems to be faster than unpack (at least using substitution):

             Rate _substr _unpack  _regex  _split
_substr 2187335/s      --    -11%    -16%    -20%
_unpack 2457294/s     12%      --     -6%    -10%
_regex  2612321/s     19%      6%      --     -4%
_split  2726283/s     25%     11%      4%      --

Perl code:

use strict;
use warnings;
use Benchmark qw/cmpthese/;

my $string = join '', 'A' .. 'Y';

sub _unpack {
    my @arr = unpack '(A5)*', $string;
}

sub _regex {
    my @arr;

    while ( length $string ) {
        $string =~ s/^(.{5})//;
        push @arr, $1;
    }
}

sub _split {
    my @arr = split /.{5}\K/, $string;
}

sub _substr {
    my @arr;

    for ( my $i = 0 ; $i < length $string ; $i += 5 ) {
        push @arr, substr $string, $i, 5;
    }
}

cmpthese(
    -5,
    {
        _unpack => sub { _unpack() },
        _split  => sub { _split() },
        _substr => sub { _substr() },
        _regex  => sub { _regex() }
    }
);
Re^3: Performance problems on splitting long strings by BrowserUk (Patriarch) on Jan 31, 2014 at 16:30 UTC
Your benchmark is totally broken.

When your _regex() function runs the first time, it completely destroys `$string`; every time after that, the regex operates on an empty string and thus runs very quickly. The same applies to every other test that runs after the first call to _regex().

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
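One way to keep the substitution approach without consuming the shared string is to have the function work on a private copy. This is only a sketch of that idea, not code posted in the thread: the name `_regex_copy` is hypothetical, and the pattern is widened to `.{1,5}` with `/s` (my change) so the loop also terminates when the length is not a multiple of 5. The copy itself costs time, so the measured rate includes it.

```perl
use strict;
use warnings;
use Benchmark qw/cmpthese/;

my $string = join '', 'A' .. 'Y';

# Hypothetical non-destructive variant: chew through a private copy,
# leaving the shared $string intact for later iterations and other tests.
sub _regex_copy {
    my $copy = $string;
    my @arr;

    while ( length $copy ) {
        # {1,5} plus /s (an assumption, not the original pattern) guarantees
        # at least one character is removed on every pass.
        $copy =~ s/^(.{1,5})//s;
        push @arr, $1;
    }
    return @arr;
}

sub _unpack {
    my @arr = unpack '(A5)*', $string;
    return @arr;
}

# Sanity check: the shared string must survive repeated calls.
_regex_copy() for 1 .. 3;
die "\$string was modified\n" unless length($string) == 25;

# Short run (about one CPU second per entry) just to compare the two.
cmpthese(
    -1,
    {
        _unpack => sub { my @a = _unpack() },
        _regex  => sub { my @a = _regex_copy() },
    }
);
```

With the string preserved, every iteration of every test does the same amount of work, so the rates become comparable.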