
Thanks! I learned how to use Benchmark tonight! I added (I think) middle-whitespace removal regexes to your test routines, using runrig's benchmark example from your "already mentioned" link above. What amazes me (if I understand the benchmark output) is that the split/join method is THE fastest way to trim extra whitespace. Did I modify the other regexes "the best way" to match the functionality of trimall_v1?

use strict;
use Benchmark 'cmpthese';
my $str = "   a b  c   d    ";

cmpthese(-5, {
 ALTERNATE=>\&alternate,
 LTSAVE=>\&lt_save,
 LTSEXEGER=>\&ltsexeger,
 WHILE_SUB=>\&while_sub,
 TRIMALL_V1=>\&trimall_v1,
 TRIMALL_V2=>\&trimall_v2,
});

# Using regex
sub trimall_v1 {
    local $_ = $str;
    s/^\s+//;
    s/\s+$//;
    s/\s+/ /g;
    $_;
}

# Using specialized split on ' ' and $_
sub trimall_v2 {
    local $_ = $str;
    $_= join ' ',split;
}

# Used runrig's benchmark example, but made the ones below trim leading,
# trailing, and extra whitespace
sub alternate {
 local $_ = $str;
 s/^\s+|\s+$//g;
 s/\s+/ /g;
 $_;
}

sub lt_save {
 local $_ = $str;
 s/^\s*(.*?)\s*$/$1/;
 s/\s+/ /g;
 $_;
}

sub ltsexeger {
 local $_ = reverse $str;
 s/^\s+//;          # strip what was the trailing whitespace
 $_ = reverse $_;   # flip the trimmed copy back, keeping the trim
 s/^\s+//;
 s/\s+/ /g;
 $_;
}

sub while_sub {
 local $_ = $str;
 1 while s/^\s//;
 1 while s/\s$//;
 1 while s/\s\s/ /;
 $_;
} 

=Benchmarks
Benchmark: running ALTERNATE, LTSAVE, LTSEXEGER, TRIMALL_V1, TRIMALL_V2, WHILE_SUB,
           each for at least 5 CPU seconds...
 ALTERNATE:  5 wallclock secs ( 5.10 usr +  0.00 sys =  5.10 CPU) @  80293.53/s (n=409497)
    LTSAVE:  6 wallclock secs ( 5.27 usr +  0.00 sys =  5.27 CPU) @  48685.39/s (n=256572)
 LTSEXEGER:  5 wallclock secs ( 5.00 usr +  0.00 sys =  5.00 CPU) @ 124402.60/s (n=622013)
TRIMALL_V1:  5 wallclock secs ( 5.12 usr +  0.00 sys =  5.12 CPU) @ 121486.91/s (n=622013)
TRIMALL_V2:  5 wallclock secs ( 5.05 usr +  0.00 sys =  5.05 CPU) @ 154931.88/s (n=782406)
 WHILE_SUB:  6 wallclock secs ( 5.00 usr +  0.00 sys =  5.00 CPU) @  68421.40/s (n=342107)
               Rate   LTSAVE WHILE_SUB ALTERNATE TRIMALL_V1 LTSEXEGER TRIMALL_V2
LTSAVE      48685/s       --      -29%      -39%       -60%      -61%       -69%
WHILE_SUB   68421/s      41%        --      -15%       -44%      -45%       -56%
ALTERNATE   80294/s      65%       17%        --       -34%      -35%       -48%
TRIMALL_V1 121487/s     150%       78%       51%         --       -2%       -20%
LTSEXEGER  124403/s     156%       82%       55%         2%        --       -20%
TRIMALL_V2 154932/s     218%      126%       93%        28%       25%         --
=cut
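
Since the timings only mean something if every variant returns the same string, here is a rough sanity check that could be run alongside the benchmark. It is only a sketch: the `%trim` table and the closure-per-variant layout are my own illustration, not part of the original test script, and the `ltsexeger` entry reverses the already-trimmed copy back, which is my assumption about what that variant is meant to do.

```perl
use strict;
use warnings;

# Hypothetical lookup table: each entry takes a string and returns a
# trimmed copy, mirroring one of the benchmarked subs.
my %trim = (
    trimall_v1 => sub { my $s = shift; $s =~ s/^\s+//; $s =~ s/\s+$//; $s =~ s/\s+/ /g; $s },
    # split ' ' is the special "awk mode": leading whitespace is skipped
    # and runs of whitespace act as a single separator.
    trimall_v2 => sub { my $s = shift; join ' ', split ' ', $s },
    alternate  => sub { my $s = shift; $s =~ s/^\s+|\s+$//g; $s =~ s/\s+/ /g; $s },
    lt_save    => sub { my $s = shift; $s =~ s/^\s*(.*?)\s*$/$1/; $s =~ s/\s+/ /g; $s },
    ltsexeger  => sub {
        my $s = scalar reverse $_[0];   # trailing whitespace is now leading
        $s =~ s/^\s+//;
        $s = scalar reverse $s;         # flip the trimmed copy back
        $s =~ s/^\s+//;
        $s =~ s/\s+/ /g;
        $s;
    },
    while_sub  => sub {
        my $s = shift;
        1 while $s =~ s/^\s//;
        1 while $s =~ s/\s$//;
        1 while $s =~ s/\s\s/ /;
        $s;
    },
);

my $str      = "   a b  c   d    ";
my $expected = 'a b c d';

# Report which variants agree with the expected result.
for my $name (sort keys %trim) {
    my $got = $trim{$name}->($str);
    printf "%-10s %s\n", $name, $got eq $expected ? 'ok' : "MISMATCH: '$got'";
}
```

One caveat on equivalence: `while_sub` only rewrites pairs of whitespace characters, so a lone tab between words would survive as a tab, whereas the `s/\s+/ /g` variants and split/join would turn it into a space. With the space-only test string used here that difference does not show up.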

In reply to Re: regex verses join/split by aquacade
in thread regex verses join/split by aquacade
