in reply to You don't always have to use regexes

Overuse of regexes is one of my favorite pet peeves also.

As you point out, eq will sometimes do everything you need. Other times all you need is index. For example, if your regex didn't have anchors:

    if ( $value =~ /true/i )

You could write instead

    if ( index( lc $value, "true" ) >= 0 )
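A minimal sketch of the point above, assuming a handful of sample values made up for illustration. The sub names are hypothetical. It checks that the unanchored regex and the index call give the same answer, and shows where the eq form differs.

```perl
use strict;
use warnings;

# Hypothetical sample inputs, chosen for illustration only.
my @samples = ( 'true', 'TRUE', 'untrue', 'FALSE', '' );

# Unanchored, case-insensitive substring test, written both ways.
sub has_true_regex { my ($value) = @_; return $value =~ /true/i ? 1 : 0 }
sub has_true_index { my ($value) = @_; return index( lc $value, "true" ) >= 0 ? 1 : 0 }

# Whole-string, case-insensitive comparison without a regex.
sub is_true_eq { my ($value) = @_; return lc $value eq "true" ? 1 : 0 }

for my $value (@samples) {
    printf "%-8s regex=%d index=%d eq=%d\n",
        "'$value'",
        has_true_regex($value),
        has_true_index($value),
        is_true_eq($value);
}
```

A value such as 'untrue' passes both substring tests but fails the eq test, so index only stands in for the regex when the regex really has no anchors.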

Re^2: You don't always have to use regexes
by kvale (Monsignor) on Feb 23, 2005 at 16:32 UTC
    I think that for the index case the situation is not so clear. Both the regex engine and index() will use the same Boyer-Moore routine, and for me personally the regex version is more readable. But as always, YMMV.
        use Benchmark qw(:all);

        my $value = 'FALSE';
        my $count = 1_000_000;

        cmpthese($count, {
            'regex' => sub { $value =~ /^true$/i },
            'eq'    => sub { lc $value eq "true" },
            'index' => sub { index( lc $value, "true" ) >= 0 },
        });
    yields
        Benchmark: timing 1000000 iterations of eq, index, regex...
                eq:  1 wallclock secs ( 0.89 usr +  0.00 sys =  0.89 CPU) @ 1123595.51/s (n=1000000)
             index:  2 wallclock secs ( 1.65 usr +  0.00 sys =  1.65 CPU) @ 606060.61/s (n=1000000)
             regex:  2 wallclock secs ( 1.63 usr +  0.00 sys =  1.63 CPU) @ 613496.93/s (n=1000000)

                   Rate index regex    eq
        index  606061/s    --   -1%  -46%
        regex  613497/s    1%    --  -45%
        eq    1123596/s   85%   83%    --
    Update: As AM has pointed out (thank you!), the benchmark above has a bug. Using the tests
        'regex'      => sub { $value =~ /true/i },
        'regex_anch' => sub { $value =~ /^true$/i },
        'eq'         => sub { lc $value eq "true" },
        'index'      => sub { index( lc $value, "true" ) >= 0 },
    I get the results
        Benchmark: timing 1000000 iterations of eq, index, regex, regex_anch...
                eq:  1 wallclock secs ( 0.88 usr +  0.00 sys =  0.88 CPU) @ 1136363.64/s (n=1000000)
             index:  0 wallclock secs ( 1.65 usr +  0.00 sys =  1.65 CPU) @ 606060.61/s (n=1000000)
             regex:  0 wallclock secs ( 1.08 usr +  0.00 sys =  1.08 CPU) @ 925925.93/s (n=1000000)
        regex_anch:  2 wallclock secs ( 1.59 usr +  0.00 sys =  1.59 CPU) @ 628930.82/s (n=1000000)

                        Rate index regex_anch regex   eq
        index       606061/s    --        -4%  -35% -47%
        regex_anch  628931/s    4%         --  -32% -45%
        regex       925926/s   53%        47%    -- -19%
        eq         1136364/s   87%        81%   23%   --
    with the surprising result that the regex w/o the anchor is faster than the anchored version. Multiple runs yield similar results. As the AM says, one could try many different regex-value combos, but I expect the results to be not far different, precisely because both index and regex engine use the same BM function.

    -Mark

      Your benchmark is significantly flawed for the question asked. The OR (original replier) wanted to compare index(lc $value, "true") with $value =~ /true/i. In addition, a fair benchmark should try multiple test cases: set $value to 'true', 'ashortstringthentrue', 'averylongstringthentrue', and strings of different sizes without 'true' in them.
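A rough sketch of the multi-value benchmark suggested in the reply above, assuming the two non-matching strings and the iteration count shown here (both are made up for illustration). It compares only the two expressions the original replier asked about, once per test value.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# The three matching values come from the reply above; the two
# non-matching strings are hypothetical examples of different lengths.
my @values = (
    'true',
    'ashortstringthentrue',
    'averylongstringthentrue',
    'nomatch',
    'amuchlongerstringthatnevermatches',
);

# Assumed iteration count; raise it if Benchmark reports too few
# iterations for a reliable count on your machine.
my $count = 1_000_000;

for my $value (@values) {
    print "value = '$value'\n";
    cmpthese( $count, {
        'index' => sub { index( lc $value, "true" ) >= 0 },
        'regex' => sub { $value =~ /true/i },
    } );
    print "\n";
}
```

Timings will vary with the perl version and hardware, so the numbers from a run like this are only comparable with each other, not with the figures posted in this thread.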
Re^2: You don't always have to use regexes
by holli (Abbot) on Feb 23, 2005 at 16:39 UTC
    I benchmarked this and it yields an interesting result. index() is (a bit) faster than a regex. If it's used in combination with lc(), as in your example, the regex with the i-modifier is faster.
        use strict;
        use warnings;
        use Benchmark;

        my $value = "somewhere here true is there!";

        timethese ( 9000000, {
            'index' => sub { index( $value, "true" ) },
            'regex' => sub { $value =~ /true/ },
        } );

        timethese ( 9000000, {
            'index' => sub { index( lc $value, "true" ) },
            'regex' => sub { $value =~ /true/i },
        } );

        Benchmark: timing 9000000 iterations of index, regex...
             index:  2 wallclock secs ( 2.02 usr +  0.00 sys =  2.02 CPU) @ 4446640.32/s (n=9000000)
             regex:  4 wallclock secs ( 2.40 usr + -0.01 sys =  2.39 CPU) @ 3760969.49/s (n=9000000)
        Benchmark: timing 9000000 iterations of index, regex...
             index:  4 wallclock secs ( 4.55 usr +  0.00 sys =  4.55 CPU) @ 1979762.43/s (n=9000000)
             regex:  3 wallclock secs ( 3.68 usr +  0.00 sys =  3.68 CPU) @ 2448313.38/s (n=9000000)


    Update:
    Ack. I really need to learn to type faster.


    holli, /regexed monk/