Something that comes up fairly often is the need to split a string into equal-sized chunks. For instance, given the string "abcdefgh12345678", splitting it into 4-char chunks would produce ("abcd", "efgh", "1234", "5678"). Looking around the monastery, I found at least a couple of posts on the subject.
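To make the task concrete, here is a minimal sketch of the chunking described above. The helper name chunk_string is made up for illustration and is not taken from any of the posts.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration only.
# Splits $str into pieces of $size characters; the last piece may be shorter.
sub chunk_string {
    my ($str, $size) = @_;
    die "chunk size must be a positive integer\n"
        unless defined $size && $size =~ /\A[1-9][0-9]*\z/;
    my @chunks;
    my $len = length $str;
    for (my $i = 0; $i < $len; $i += $size) {
        push @chunks, substr($str, $i, $size);
    }
    return @chunks;
}

my @chunks = chunk_string("abcdefgh12345678", 4);
print join(",", @chunks), "\n";   # abcd,efgh,1234,5678
```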
I timed several different techniques against each other:
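use strict;
use warnings;
use Benchmark qw(cmpthese);

my $str = "abcdefgh12345678" x 20;
my $strlen = length $str;

cmpthese(50000, {
    # split with a capturing group returns the chunks plus empty fields;
    # grep {$_} drops the empty fields (it would also drop a trailing "0").
    'grep_split' => sub
    {
        my @arr = grep {$_} split /(.{8})/, $str;
    },
    # Zero-width split that only succeeds where pos() is a multiple of 8.
    'split_pos' => sub
    {
        my @arr = split /(?(?{pos() % 8})(?!))/, $str;
    },
    # One substr call per chunk index; uses the precomputed $strlen.
    'substr_map' => sub
    {
        my $len = length $str;
        my @arr = map {substr($str, $_ * 8, 8)} (0 .. $strlen / 8 - 1);
    },
    # Plain C-style loop stepping through the string 8 characters at a time.
    'substr_loop' => sub
    {
        my @arr;
        my $len = length $str;
        for (my $i = 0; $i < $len; $i += 8)
        {
            push(@arr, substr($str, $i, 8));
        }
    },
    # Repeated 8-character fields; 'A' strips trailing spaces from each field.
    'unpack' => sub
    {
        my @arr = unpack('(A8)*', $str);
    }
});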
And the results are quite surprising:
               Rate
split_pos    3203/s
grep_split   6425/s
substr_map   8889/s
unpack      11348/s
substr_loop 15097/s
Contrary to my expectation that built-in functions should be faster than explicit loops, the looping solution is the swiftest. It beats unpack by a margin ranging from 15 to 50 percent, depending on the length of the string and the size of the chunks.
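One caveat when comparing unpack against the substr loop: the two are not strictly equivalent on every input, because the 'A' template strips trailing spaces and NULs from each field while 'a' returns the data unchanged. The sketch below uses a made-up padded input to show the difference. It makes no claim about relative speed.

```perl
use strict;
use warnings;

# Hypothetical input: each 4-char chunk ends in two spaces.
my $padded = "ab  cd  ";

my @stripped = unpack('(A4)*', $padded);   # 'A' strips trailing spaces and NULs
my @exact    = unpack('(a4)*', $padded);   # 'a' keeps every character

print join("", map { "[$_]" } @stripped), "\n";   # [ab][cd]
print join("", map { "[$_]" } @exact), "\n";      # [ab  ][cd  ]
```

For the benchmark string, which contains no spaces, both templates give the same chunks.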
Is there any way to make it faster?
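One more candidate that could be added to the comparison is a global regex match in list context. I have not timed it, so this is only a sketch that checks it yields the same chunks as the substr loop. The sub names regex_chunks and loop_chunks are made up for illustration.

```perl
use strict;
use warnings;

my $str = "abcdefgh12345678" x 20;

# Candidate: a /g match in list context returns every match.
# /s lets '.' match newlines; {1,8} keeps a shorter trailing chunk.
sub regex_chunks {
    my ($s) = @_;
    return $s =~ /.{1,8}/gs;
}

# Reference: the substr loop from the benchmark above.
sub loop_chunks {
    my ($s) = @_;
    my @arr;
    my $len = length $s;
    for (my $i = 0; $i < $len; $i += 8) {
        push @arr, substr($s, $i, 8);
    }
    return @arr;
}

my @by_regex = regex_chunks($str);
my @by_loop  = loop_chunks($str);

print "same chunks\n" if join(",", @by_regex) eq join(",", @by_loop);
print scalar(@by_regex), " chunks\n";   # 40 chunks
```

To time it, the body of regex_chunks could be dropped into the cmpthese hash as another entry.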