Some of you may have wondered why I didn't just benchmark it myself. Because I was being stupid. I looked at tedv's benchmark and couldn't believe I hadn't bothered to check. Blame it on over two weeks straight of working without a day off :(
A critical part of regex optimization is what happens on failure. I built a test string that forces the regex to fail at least once before it matches, then benchmarked the two forms:
#!/usr/bin/perl -w
use strict;
use Benchmark;
my ( $re, $test );
timethese(-30, {
    regex1 => '$re = "a" . \'\d\'x500 . "a";
               $test = ("a"x2000 . "1"x499)x2 . "a" . "1"x500 . ("a"x2000 . "1"x499)x2 ;
               $test =~ /$re/o;',
    regex2 => '$re = "a" . \'\d{500}\' . "a";
               $test = ("a"x2000 . "1"x499)x2 . "a" . "1"x500 . ("a"x2000 . "1"x499)x2 ;
               $test =~ /$re/o;',
});
Results:
Benchmark: running regex1, regex2, each for at least 30 CPU seconds...
regex1: 31 wallclock secs (31.31 usr + 0.00 sys = 31.31 CPU) @ 655.10/s (n=20508)
regex2: 31 wallclock secs (30.84 usr + 0.00 sys = 30.84 CPU) @ 530.35/s (n=16358)
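After running this a couple of times, I see that the counted form (\d{500} here, standing in for \d{2}) is less efficient than writing the \d out repeatedly (\d\d), though not by much: about 530 matches per second against about 655 in the run above. If I am iterating over terribly convoluted data, though, it could be an issue.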
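One caveat about my benchmark: each timed iteration also rebuilds $re and $test, so the numbers include string construction as well as the match. A minimal sketch of a variant that times only the match, assuming the same test string and patterns, might look like the following. The variable names ($chunk, $repeated, $quantified) and the 3-second run time are my own choices for illustration, and I have not included results for it.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Same test string as in the benchmark above, built once outside the timed code.
my $chunk = ("a" x 2000 . "1" x 499) x 2;
my $test  = $chunk . "a" . "1" x 500 . $chunk;

# Precompile both patterns so regex compilation is not part of the timing.
my $repeated_src   = "a" . '\d' x 500 . "a";   # a\d\d...\da
my $quantified_src = 'a\d{500}a';
my $repeated       = qr/$repeated_src/;
my $quantified     = qr/$quantified_src/;

# Sanity check: both forms must match, and at the same offset.
$test =~ $repeated or die "repeated form did not match";
my $pos_repeated = $-[0];
$test =~ $quantified or die "quantified form did not match";
my $pos_quantified = $-[0];
die "patterns disagree" unless $pos_repeated == $pos_quantified;

# Time only the match; -3 means at least 3 CPU seconds per entry.
cmpthese(-3, {
    repeated   => sub { my $hit = $test =~ $repeated },
    quantified => sub { my $hit = $test =~ $quantified },
});
```

cmpthese prints a comparison table with the rate of each entry and the percentage difference between them, which makes the gap easier to read than two separate timethese lines.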
I still don't see why I didn't benchmark it before asking. Thanks for slapping sense into me, tedv :)
Cheers,
Ovid
Join the Perlmonks Setiathome Group or just go to the link and check out our stats.