in reply to Re: Re: Re: Regular Expression Question
in thread Regular Expression Question

Update: with regard to "It's probably faster to use 2 regexps too": Yes, a quick benchmark shows that, with anchoring, the double-regex style runs about 50% faster than the single-regex solution I posted. (Perhaps one of the resident regex gurus can explain why this is?)
I'd be interested to see your benchmark (code + data), as I don't come to the same conclusion. The benchmark below shows the single-regex solution to be somewhat faster (about 35% here), although the data sample is tiny.
#!/usr/bin/perl

use strict;
use warnings;
use Benchmark qw /timethese cmpthese/;

chomp (our @lines = <DATA>);
our (@r1, @r2);

cmpthese -10 => {
    one => '@r1 = map {/^\w+(?:,\w+)*$/ ? 1 : 0} @lines',
    two => '@r2 = map {/^[\w,]+$/ && !/^,|,,|,$/ ? 1 : 0} @lines',
};

die "Unequal" unless "@r1" eq "@r2";

__DATA__
one,two,three,four,five
,one,two,three,four,five
one,two,three,four,five,
one,two,three,,four,five
one,two,three four,five

       Rate  two  one
two 25436/s   -- -26%
one 34417/s  35%   --
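As a rough illustration of what the two approaches benchmarked above are expected to agree on, the sketch below runs both against the five sample lines and prints the verdict for each. It assumes the fifth sample is the single line `one,two,three four,five` (containing a space); the array names `@single` and `@double` are only for this illustration.

```perl
use strict;
use warnings;

# The five sample lines from the benchmark's __DATA__ section.
# Assumption: the fifth entry is one line that contains a space.
my @lines = (
    'one,two,three,four,five',
    ',one,two,three,four,five',
    'one,two,three,four,five,',
    'one,two,three,,four,five',
    'one,two,three four,five',
);

# Single-regex approach: a word, then zero or more ",word" groups.
my @single = map { /^\w+(?:,\w+)*$/ ? 1 : 0 } @lines;

# Double-regex approach: only word characters and commas, and no
# leading, doubled, or trailing comma.
my @double = map { /^[\w,]+$/ && !/^,|,,|,$/ ? 1 : 0 } @lines;

# Print one row per sample so the two verdicts can be compared.
printf "%-26s single=%d double=%d\n", $lines[$_], $single[$_], $double[$_]
    for 0 .. $#lines;
```

Only the first line should be accepted by either approach; the others fail on a leading comma, a trailing comma, a doubled comma, and a space respectively.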

Abigail

Re: Re: Regular Expression Question (show me the can^H^H^Hbenchmark)
by simonm (Vicar) on Dec 05, 2003 at 21:56 UTC
    I'd be interested to see your benchmark (code + data), as I don't come to the same conclusion.

    Test and output attached below. Looks like it is dependent on your data set...

    use strict;
    use Benchmark 'cmpthese';

    my @data = <DATA>;
    my @long = map { join '', $_ x 100 } @data;

    my %cases = (
        'Single' => sub { for ( @long ) { /^\w+(?:,\w+)*$/ } },
        'Double' => sub { for ( @long ) { /^[\w,]+$/ && ! /^,|,,|,$/ } },
    );

    cmpthese( 0, \%cases);

    __DATA__
    !@#$as3dfa
    ,sdfas3df,
    asd3fsa,,a3sdf
    as3df,asdf3,3asdf,asd3f
    sad3fasdjasdfkasdfklas3jf
    3sad3fasdjasdfkasdfklas3jf
    3sad3fasdjasdfkasdfklas3jf3

              Rate Single Double
    Single  6158/s     --   -83%
    Double 35319/s   474%     --
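One way to see how much a result like this depends on the data is to count how many of the benchmark strings each approach actually accepts, since strings that fail exercise the regex engine differently from strings that match. The sketch below is only an illustration: `is_valid_single`, `is_valid_double` and `count_valid` are hypothetical helper names, and it assumes that every line read from `DATA` without `chomp` still ends in a newline, so repeating it 100 times embeds newlines in the string.

```perl
use strict;
use warnings;

# The seven sample lines from the reply's __DATA__ section.
my @samples = (
    '!@#$as3dfa',
    ',sdfas3df,',
    'asd3fsa,,a3sdf',
    'as3df,asdf3,3asdf,asd3f',
    'sad3fasdjasdfkasdfklas3jf',
    '3sad3fasdjasdfkasdfklas3jf',
    '3sad3fasdjasdfkasdfklas3jf3',
);

# Hypothetical helpers wrapping the two approaches from the thread.
sub is_valid_single { return $_[0] =~ /^\w+(?:,\w+)*$/ ? 1 : 0 }

sub is_valid_double {
    return ( $_[0] =~ /^[\w,]+$/ && $_[0] !~ /^,|,,|,$/ ) ? 1 : 0;
}

# Count how many strings a given check accepts.
sub count_valid {
    my ( $check, @strings ) = @_;
    return scalar grep { $check->($_) } @strings;
}

# Assumption: each line keeps its trailing "\n" (no chomp), so the
# repeated string contains a newline after every copy.
my @unchomped = map { my $line = "$_\n"; $line x 100 } @samples;

# The same samples with the newline removed before repeating.
my @chomped = map { $_ x 100 } @samples;

for my $set ( [ unchomped => \@unchomped ], [ chomped => \@chomped ] ) {
    my ( $name, $strings ) = @$set;
    printf "%-10s single accepts %d, double accepts %d of %d\n",
        $name,
        count_valid( \&is_valid_single, @$strings ),
        count_valid( \&is_valid_double, @$strings ),
        scalar @$strings;
}
```

Under that assumption neither approach accepts any of the unchomped strings, because `\w` and `,` cannot cross an embedded newline, while four of the seven chomped strings are accepted by both. Timings taken on one set may therefore say little about the other.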