Curiouser and curiouser. Changing the tests changes the relative performance of the methods, but it also slows all of them down.

I thought that using globals instead of lexicals might have been part of the difference, and it is, but only a small part.

#! perl -slw
use strict;
use Benchmark qw[ cmpthese ];

our $TEST ||= 0;
our $N = $TEST ? 10 : $N || 1000;
our @data = map{ join' ', '2004-05-13', '14:02:00', ('blah') x (1+rand( 9 )) } 1 .. $N;

our (@greedy, @explicit, @unpack);

cmpthese( $TEST ? 1 : -1, {
    our_g   => '@greedy   = map {/(^\S*)\s(\S*)\s(.*$)/}        @data',
    our_e   => '@explicit = map {/(^\d{4}\-\d{2}\-\d{2})\s
                                   (\d{2}:\d{2}:\d{2})\s(.*$)/x} @data',
    our_u   => '@unpack   = map {unpack "A10 x A8 x A*" => $_}  @data',

    my_g    => 'my @greedy   = map {/(^\S*)\s(\S*)\s(.*$)/}        @data',
    my_e    => 'my @explicit = map {/(^\d{4}\-\d{2}\-\d{2})\s
                                   (\d{2}:\d{2}:\d{2})\s(.*$)/x} @data',
    my_u    => 'my @unpack   = map {unpack "A10 x A8 x A*" => $_}  @data',

    greedy => q[
        my( $date, $time, $text );
        
        m[(^\S*)\s(\S*)\s(.*$)]
            and ( $date, $time, $text ) = ( $1, $2, $3 )
#            and $TEST and print "greedy: $date|$time|$text"
            for @data;
    ],
    explicit => q[
        my( $date, $time, $text );
        
        m[(^\d{4}\-\d{2}\-\d{2})\s(\d{2}:\d{2}:\d{2})\s(.*$)]
            and ( $date, $time, $text ) = ( $1, $2, $3 )
#            and $TEST and print "explicit: $date|$time|$text"
            for @data;
    ],
    unpackA => q[
        use bytes;
        my( $date, $time, $text );

        ( $date, $time, $text ) = unpack 'A10 x A8 x A*', $_
#            and $TEST and print "unpackA: $date|$time|$text"
            for @data;
    ],
    substr => q[
        use bytes;
        my( $date, $time, $text );

        ( $date, $time, $text ) = 
            ( 
                substr( $_, 0, 10 ),
                substr( $_, 11, 8 ),
                substr( $_, 20 )
            )
#            and $TEST and print "substr: $date|$time|$text"
            for @data;
    ],
});
    
__END__
P:\test>362106
           Rate our_e our_g our_u  my_e my_g my_u unpackA substr explicit greedy
our_e    72.4/s    --   -2%  -15%  -28% -30% -43%    -55%   -73%     -77%   -79%
our_g    73.6/s    2%    --  -13%  -27% -29% -42%    -54%   -73%     -77%   -79%
our_u    85.0/s   17%   15%    --  -16% -18% -33%    -47%   -69%     -73%   -75%

my_e      101/s   39%   37%   19%    --  -3% -20%    -37%   -63%     -68%   -71%
my_g      104/s   43%   41%   22%    3%   -- -18%    -35%   -62%     -67%   -70%
my_u      126/s   74%   71%   48%   25%  21%   --    -21%   -53%     -60%   -64%

unpackA   160/s  121%  117%   88%   59%  54%  27%      --   -41%     -49%   -54%
substr    270/s  273%  267%  218%  168% 160% 114%     69%     --     -14%   -22%
explicit  314/s  334%  327%  270%  212% 203% 149%     96%    16%       --    -9%
greedy    346/s  378%  370%  307%  243% 234% 175%    116%    28%      10%     --

It would be interesting to see the benchmark run on 5.6.2 (pre-unicodification), which I don't have installed currently.
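
Before trusting the timings on any version, it may be worth confirming that the four extraction methods really return the same three fields. A minimal sketch, assuming a made-up sample line in the same shape as the benchmark's @data (the `%fields` and `%joined` names are mine, not part of the benchmark):

```perl
#! perl -w
use strict;

# Hypothetical sample line, same layout as the benchmark's @data entries.
my $line = join ' ', '2004-05-13', '14:02:00', ('blah') x 3;

# Extract date, time and text with each of the four methods benchmarked above.
my %fields = (
    greedy   => [ $line =~ m[(^\S*)\s(\S*)\s(.*$)] ],
    explicit => [ $line =~ m[(^\d{4}-\d{2}-\d{2})\s(\d{2}:\d{2}:\d{2})\s(.*$)] ],
    unpackA  => [ unpack 'A10 x A8 x A*', $line ],
    substr   => [ substr( $line, 0, 10 ), substr( $line, 11, 8 ), substr( $line, 20 ) ],
);

# Join each result with '|' so the three fields compare as a single string.
my %joined = map { $_ => join '|', @{ $fields{$_} } } keys %fields;

for my $name ( sort keys %joined ) {
    printf "%-8s %s\n", $name, $joined{$name};
    die "$name differs from greedy\n" if $joined{$name} ne $joined{greedy};
}
```

Note that the A template strips trailing spaces and substr does not, so the methods are only guaranteed to agree on lines without trailing whitespace, which is what the generated test data contains.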

Examine what is said, not who speaks.

"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail

In reply to Re^2: fast greedy regex by BrowserUk
in thread fast greedy regex by js1
