Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

O Wise Ones,

In my quest to compare equivalent regular expressions, I have attempted to reduce duplicate code by moving the "testing" to a subroutine. The odd thing is that when moved into a subroutine, alternation takes about the same time as when using character classes.

-------------
Without the sub:
(Output =
Alternation takes 1.395 seconds.
Character class 0.043 seconds.
)
-------------
use strict;
use Time::HiRes 'time';

sub main {
    my $TimesToDo  = 1000;
    my $TestString = "abababdedfg" x 1000;

    my $Count     = $TimesToDo;
    my $StartTime = time();
    while ($Count-- > 0) {
        $TestString =~ m/^(a|b|c|d|e|f|g)+$/;
    }
    my $EndTime = time();
    printf("Alternation takes %.3f seconds.\n", $EndTime - $StartTime);

    $Count     = $TimesToDo;
    $StartTime = time();
    while ($Count-- > 0) {
        $TestString =~ m/^[a-g]+$/;
    }
    $EndTime = time();
    printf("Character class %.3f seconds.\n", $EndTime - $StartTime);
}

unless (caller) { main() }

----------
With the sub:
(Output =
Alternation takes 0.000 seconds.
Character class 0.001 seconds.
)
----------
use strict;
use Time::HiRes 'time';

# TimesToDo, TestString, Regex
sub test {
    my $TimesToDo  = shift;
    my $TestString = shift() x 1000;

    my $Count     = $TimesToDo;
    my $StartTime = time();
    while ($Count-- > 0) {
        $TestString =~ m/^$_[2]+$/;
    }
    my $EndTime = time();
    return $EndTime - $StartTime;
}

sub main {
    my $result = test(1000, "abababdedfg", "(a|b|c|d|e|f|g)");
    printf("Alternation takes %.3f seconds.\n", $result);

    $result = test(1000, "abababdedfg", "[a-g]");
    printf("Character class %.3f seconds.\n", $result);
}

unless (caller) { main() }

Very grateful for any responses!

Re: Benchmarking regexes
by zwon (Abbot) on Mar 17, 2009 at 23:57 UTC

    use warnings would help you find the problem. After two shifts, the regex that was passed as $_[2] is at $_[0], so $_[2] is undefined.

      Haha, thank you very, very, very much, it works just fine now :D I feel somewhat silly and quite humbled, but I guess that's what you're supposed to feel around monks. Happy trails!
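For readers following along, here is a minimal sketch of how the corrected sub might look. It is not the poster's final code, which the thread does not show. It assumes the arguments are copied into named variables instead of being shifted, which avoids the index problem altogether. Precompiling the pattern with qr// and returning a match count next to the elapsed time are additions for illustration, so that a pattern which silently fails to match shows up in the output.

```perl
use strict;
use warnings;
use Time::HiRes 'time';

# Arguments: TimesToDo, TestString, Regex
# Returns (elapsed seconds, number of successful matches).
# The match count is an illustrative addition, not part of the original post.
sub test {
    my ($TimesToDo, $TestString, $Regex) = @_;   # named copies, no index shifting
    $TestString x= 1000;

    # Build the same pattern the original interpolated, but compile it once.
    my $Pattern = qr/^${Regex}+$/;

    my $Matches   = 0;
    my $Count     = $TimesToDo;
    my $StartTime = time();
    while ($Count-- > 0) {
        $Matches++ if $TestString =~ $Pattern;
    }
    my $Elapsed = time() - $StartTime;

    return ($Elapsed, $Matches);
}

for my $Case (["Alternation takes", "(a|b|c|d|e|f|g)"],
              ["Character class",   "[a-g]"]) {
    my ($Label, $Regex) = @$Case;
    my ($Elapsed, $Matches) = test(1000, "abababdedfg", $Regex);
    printf("%s %.3f seconds (%d of 1000 matched).\n", $Label, $Elapsed, $Matches);
}
```

If the match count is ever lower than the repetition count, the timing is measuring failed matches and should not be compared with the other result.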
Re: Benchmarking regexes
by moritz (Cardinal) on Mar 17, 2009 at 23:44 UTC
    Without having read the rest of your post:
    With the sub:
    (Output = Alternation takes 0.000 seconds. Character class 0.001 seconds. )

    I hope you didn't draw any conclusions from this "result"? It's clear that the accuracy of your time measurement and/or output is too low to tell you anything. If one of the values is zero and the other has only one significant digit, you didn't really measure anything.

    I'd recommend using Benchmark, which adjusts the repetition count if the accuracy is too low.
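A rough sketch of what that suggestion could look like with the core Benchmark module, using the two regexes from the question. The hash of precompiled patterns and the sanity check before the run are assumptions added for illustration; only the use of Benchmark itself comes from the reply above.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $TestString = "abababdedfg" x 1000;

# The same two regexes as in the question, compiled once up front.
my %Regex = (
    alternation => qr/^(a|b|c|d|e|f|g)+$/,
    char_class  => qr/^[a-g]+$/,
);

# Sanity check first: timing a pattern that never matches
# would measure the wrong thing.
for my $Name (sort keys %Regex) {
    die "$Name does not match the test string\n"
        unless $TestString =~ $Regex{$Name};
}

# A negative count asks Benchmark to run each sub for at least that many
# CPU seconds, so it chooses the iteration count itself.
cmpthese(-1, {
    alternation => sub { my $ok = $TestString =~ $Regex{alternation} },
    char_class  => sub { my $ok = $TestString =~ $Regex{char_class} },
});
```

cmpthese prints a table of rates and relative differences. The numbers depend on the machine and the Perl version, so none are quoted here.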

      Thank you for your quick reply.

      No worries, no conclusions have been made. I was wondering why moving the test into a subroutine produces the second result instead of the first. I was planning on running quite a few examples and was hoping to minimize the amount of code by creating the subroutine test, but it isn't working as I hoped...

Re: Benchmarking regexes
by Anonymous Monk on Mar 17, 2009 at 23:43 UTC

    Things I forgot to mention:

    • The test is taken from Friedl's book Mastering Regular Expressions.
    • I was wondering what causes the difference in output.

    Thank you for any help!