Re: You don't always have to use regexes

I suppose the whole point of this example is to show how to program efficiently, without wasting system resources (here: CPU time).

If so, I'd like to emphasize that NO ONE here seems to see a problem with the "true" expression. Please do not use an interpolating (double-quoted) string if you do not need interpolation. Try your benchmarks again with 'true'.

Update:

Of course I ran the benchmarks before posting this node. The speed differences are not extraordinary, but they are consistently about 5%.

Code:

#!/usr/bin/perl_parallel -w
# For Emacs: -*- mode:cperl; mode:folding -*-

use strict;
use warnings;

use Benchmark;

my $value = "somewhere here true is there!";

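# Case-sensitive search: index() versus an unanchored regex.
timethese(
    5000000,
    {
        'index' => sub { index( $value, 'true' ) },
        'regex' => sub { $value =~ /true/ },
    }
);

# Case-insensitive search: lowercase the string first versus the /i modifier.
timethese(
    5000000,
    {
        'index' => sub { index( lc $value, 'true' ) },
        'regex' => sub { $value =~ /true/i },
    }
);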

Benchmark Results (single quotes):

Benchmark: timing 5000000 iterations of index, regex...
     index:  2 wallclock secs ( 1.04 usr +  0.00 sys =  1.04 CPU) @ 4807692.31/s (n=5000000)
     regex:  2 wallclock secs ( 2.16 usr +  0.01 sys =  2.17 CPU) @ 2304147.47/s (n=5000000)
Benchmark: timing 5000000 iterations of index, regex...
     index:  3 wallclock secs ( 2.83 usr +  0.00 sys =  2.83 CPU) @ 1766784.45/s (n=5000000)
     regex:  4 wallclock secs ( 3.45 usr +  0.02 sys =  3.47 CPU) @ 1440922.19/s (n=5000000)

Benchmark Results (double quotes):

Benchmark: timing 5000000 iterations of index, regex...
     index:  1 wallclock secs ( 1.10 usr +  0.00 sys =  1.10 CPU) @ 4545454.55/s (n=5000000)
     regex:  2 wallclock secs ( 2.25 usr +  0.01 sys =  2.26 CPU) @ 2212389.38/s (n=5000000)
Benchmark: timing 5000000 iterations of index, regex...
     index:  4 wallclock secs ( 2.99 usr +  0.01 sys =  3.00 CPU) @ 1666666.67/s (n=5000000)
     regex:  3 wallclock secs ( 3.66 usr +  0.01 sys =  3.67 CPU) @ 1362397.82/s (n=5000000)

Perl:

perl_parallel -V
Summary of my perl5 (revision 5 version 8 subversion 6) configuration:
  Platform:
    osname=linux, osvers=2.6.8-24-smp, archname=i586-linux-thread-multi
    uname='linux builder 2.6.8-24-smp #1 smp wed oct 6 09:16:23 utc 2004 i686 i686 i386 gnulinux '
    config_args='-ds -e -Dprefix=/opt/PM_perl-5.8.6 -Dvendorprefix=/opt/PM_perl-5.8.6 -Dinstallusrbinperl -Dusethreads -Di_db -Di_dbm -Di_ndbm -Di_gdbm -Duseshrplib=true -Doptimize=-O2 -g -march=i586 -mcpu=i686 -fmessage-length=0 -Wall -pipe'

Bye
PetaMem All Perl: MT, NLP, NLU

Re^2: You don't always have to use regexes by Tanktalus (Canon) on Feb 24, 2005 at 22:52 UTC
Actually, many of us saw it. But we also saw this: Re: To Single Quote or to Double Quote: a benchmark. The point is, the difference in speed is practically meaningless. In the grand scheme of the transition from `$value =~ /true/i` to `lc $value eq "true"`, changing that to `lc $value eq 'true'` is going to have a demonstrably small effect.
Re^3: You don't always have to use regexes by bmann (Priest) on Feb 24, 2005 at 23:41 UTC
And to support your point, an invariant string inside double quotes gets compiled down to a single-quoted string. Any time wasted is wasted at compile time, not at run time.

$ cat print.pl
print 'Hello';
print "Hello";    # compiles to 'Hello'
print "Hello $_";

$ perl -MO=Deparse print.pl
print 'Hello';
print 'Hello';
print "Hello $_";
print.pl syntax OK

5.005_03, 5.6.1 and 5.8.4 produce identical results.
Re^4: You don't always have to use regexes by Ven'Tatsu (Deacon) on Feb 25, 2005 at 14:28 UTC
Minor nitpick: if you're going to use B::Deparse to show how perl handles strings internally, consider using the `-q` option. From B::Deparse:

> Expand double-quoted strings into the corresponding combinations of concatenation, uc, ucfirst, lc, lcfirst, quotemeta, and join. ... Note that the expanded form represents the way perl handles such constructions internally -- this option actually turns off the reverse translation that B::Deparse usually does.

$ perl -MO=Deparse,-q print.pl
print 'Hello';
print 'Hello';
print 'Hello ' . $_;
print.pl syntax OK
Re^2: You don't always have to use regexes by petdance (Parson) on Feb 25, 2005 at 03:06 UTC
> I suppose the whole point of this example is to show how to program efficiently, without wasting system resources (here: CPU time).

Absolutely not. That has nothing to do with it. CPU efficiencies on the scale that we're talking about are irrelevant. The point is to use the construct that most closely matches the semantics of what you're trying to achieve. If you're wondering if one string is the word "true", then that's not a pattern match, it's a string comparison.

xoxo,
Andy
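
To make the semantic difference concrete, here is a minimal sketch. It is not taken from any post in this thread, and the helper names is_true_exact, has_true_index and has_true_regex are made up for illustration. It contrasts the string comparison with the two "occurs anywhere" tests from the benchmark. Note that index() returns -1 when nothing is found, so its result has to be compared against 0 before it can serve as a boolean.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helpers, named for this illustration only.

# String comparison: is the whole value the word "true", ignoring case?
sub is_true_exact {
    my ($value) = @_;
    return lc($value) eq 'true';
}

# Substring search: does "true" occur anywhere, ignoring case?
# index() returns -1 when the substring is absent, hence the ">= 0".
sub has_true_index {
    my ($value) = @_;
    return index( lc($value), 'true' ) >= 0;
}

# Unanchored pattern match: also asks "does it occur anywhere?".
sub has_true_regex {
    my ($value) = @_;
    return $value =~ /true/i ? 1 : 0;
}

# Print one row per sample value: 1 means the test passed, 0 that it failed.
for my $value ( 'TRUE', 'untrue', 'somewhere here true is there!' ) {
    printf "%-32s eq:%d index:%d regex:%d\n",
        "'$value'",
        is_true_exact($value)  ? 1 : 0,
        has_true_index($value) ? 1 : 0,
        has_true_regex($value);
}
```

A value such as 'untrue' passes both substring tests but fails the eq test, which is why the choice between them should follow from what you actually want to ask, not from the benchmark numbers.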
Re^3: You don't always have to use regexes by PetaMem (Priest) on Feb 25, 2005 at 15:35 UTC
> If you're wondering if one string is the word "true", then that's not a pattern match, it's a string comparison.

OK, I second that. I was probably misled by the benchmarks that immediately popped up in this thread.

Bye
PetaMem All Perl: MT, NLP, NLU