Some of you may have wondered why I didn't just benchmark it myself. Because I was being stupid. I looked at tedv's benchmark and couldn't believe I hadn't bothered to check. Blame it on over two-weeks straight of working without a day off :(

A critical portion of regex optmization is what occurs on failure. I created a string which had at least one failure for the regex and then benchmarked the comparisons:

#!/usr/bin/perl -w use strict; use Benchmark; my ( $re, $test ); timethese(-30, { regex1 => '$re = "a" . \'\d\'x500 . "a"; $test = ("a"x2000 . "1"x499)x2 . "a" . "1"x500 . ("a"x +2000 . "1"x499)x2 ; $test =~ /$re/o;', regex2 => '$re = "a" . \'\d{500}\' . "a"; $test = ("a"x2000 . "1"x499)x2 . "a" . "1"x500 . ("a"x +2000 . "1"x499)x2 ; $test =~ /$re/o;', });
Results:
Benchmark: running regex1, regex2, each for at least 30 CPU seconds... regex1: 31 wallclock secs (31.31 usr + 0.00 sys = 31.31 CPU) @ 65 +5.10/s (n=20508) regex2: 31 wallclock secs (30.84 usr + 0.00 sys = 30.84 CPU) @ 53 +0.35/s (n=16358)
After running this a couple of times, I see that \d{2} is less efficient than \d\d, though not by much. If there is terribly convoluted data I am iterating over, though, it could be an issue.

I still don't see why I didn't benchmark it before asking. Thanks for slapping sense into me, tedv :)

Cheers,
Ovid

Join the Perlmonks Setiathome Group or just go the the link and check out our stats.


In reply to (Ovid) RE(2): Regex optimizations by Ovid
in thread Regex optimizations by Ovid

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.