in reply to Re^3: Surprisingly poor regex performance
in thread Surprisingly poor regex performance

You're right, of course; I shouldn't have said "means the same as"; I meant "would have the same effect as".

Still, I think it's accurate to say that ^ means very nearly the same as your example: (?:\A|\n), so it's still quite surprising to me that yours is so much faster.

Re^5: Surprisingly poor regex performance
by dragonchild (Archbishop) on Dec 14, 2004 at 13:53 UTC
    I'm actually really curious about that, as well. I /msg'ed japhy asking if he'd pop in and help us out. My benchmarking shows a 15x speedup using (?:\A|\n) over using ^ with the /m modifier.

    Oh, and taking away the /m modifier when using (?:\A|\n) gives perhaps a 1% speedup. I guess randomly adding modifiers is bad. :-)
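For anyone who wants to try the comparison without the original mmap'd file, here is a minimal, self-contained sketch. The data, the search string 'needle', and the helper count_matches are all made up for illustration; they are not the benchmark code used earlier in the thread. Timings will depend on the Perl version and the data, so treat any ratio it prints as indicative only.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical stand-in for the mmap'd file: 20,000 short lines,
# every hundredth of which contains the search string.
my $pat  = 'needle';
my $data = join '',
    map { $_ % 100 ? "line $_ of filler\n" : "line $_ has a needle\n" } 1 .. 20_000;

# The two anchoring styles discussed above.
my $caret_m   = qr/^(.*$pat.*\n)/m;
my $alternate = qr/(?:\A|\n)(.*$pat.*\n)/;

# Count how many times a compiled pattern matches in $data.
sub count_matches {
    my ($re) = @_;
    my $n = 0;
    $n++ while $data =~ /$re/g;
    return $n;
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese( -1, {
    caret_m   => sub { count_matches($caret_m) },
    alternate => sub { count_matches($alternate) },
} );
```

The matching lines are deliberately never adjacent here, so both patterns find the same 200 lines and the comparison is like for like.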

    Being right, does not endow the right to be rude; politeness costs nothing.
    Being unknowing, is not the same as being stupid.
    Expressing a contrary opinion, whether to the individual or the group, is more often a sign of deeper thought than of cantankerous belligerence.
    Do not mistake your goals as the only goals; your opinion as the only opinion; your confidence as correctness. Saying you know better is not the same as explaining you know better.

      How does it compare to (?:\A|(?<=\n)) ? Isn't that the accurate representation of /^.../m ? You could also try (?<![^\n]) which seems somehow even simpler.
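To make the difference concrete, here is a small sketch on made-up data (the string and variable names are illustrative only). Because (?:\A|\n) consumes the newline, and the capture also consumes the trailing newline of the matched line, a matching line that immediately follows another matching line gets skipped. The zero-width forms do not have that problem.

```perl
use strict;
use warnings;

# Made-up sample: "foo2" and "foo3" are on adjacent lines.
my $text = "foo1\nbar\nfoo2\nfoo3\n";
my $pat  = 'foo';

# /^/m is zero-width, so every matching line is found.
my @caret      = $text =~ /^(.*$pat.*\n)/mg;

# (?:\A|\n) needs a newline to consume, but the previous match already ate it.
my @consume    = $text =~ /(?:\A|\n)(.*$pat.*\n)/g;

# The lookbehind is zero-width again, so it behaves like /^/m here.
my @lookbehind = $text =~ /(?:\A|(?<=\n))(.*$pat.*\n)/g;

printf "%-10s %d\n", 'caret',      scalar @caret;       # 3
printf "%-10s %d\n", 'consume',    scalar @consume;     # 2 (misses foo3)
printf "%-10s %d\n", 'lookbehind', scalar @lookbehind;  # 3
```

So the consuming version is only equivalent when matching lines are never adjacent, which is worth keeping in mind when comparing the timings.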

      - tye        

        Using the benchmarking code above ...
        timethese( 100, {
            'better' => sub { while ($mmap =~ m/(?:\A|\n)(.*$pat.*\n)/omg) { } },
            'tye1'   => sub { while ($mmap =~ m/(?:\A|(?<=\n))(.*$pat.*\n)/omg) { } },
            'tye2'   => sub { while ($mmap =~ m/(?:\A|(?<![^\n]))(.*$pat.*\n)/omg) { } },
        })
        __END__
        Benchmark: timing 100 iterations of better, tye1, tye2...
            better: 11 wallclock secs (10.61 usr + 0.00 sys = 10.61 CPU) @ 9.43/s (n=100)
              tye1: 13 wallclock secs (12.37 usr + 0.00 sys = 12.37 CPU) @ 8.08/s (n=100)
              tye2: 14 wallclock secs (14.16 usr + 0.00 sys = 14.16 CPU) @ 7.06/s (n=100)

        I have no idea how to figure out why these numbers are the way they are. The only thing I can think of is that the lookaround assertion takes more time than the actual character check. *shrugs*