in reply to The cost of unchecked best practices

Off topic: Perhaps single-character positive character classes should be optimized to be equivalent to a literal character? Perhaps something can be done about single-character negative character classes too.

Update: Corion sent me a message saying he thought this was already done in 5.10, so I put it to the test. Indeed, it is.

Test code used (same as the OP's, except that it avoids the extra sub calls):

#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw( cmpthese );

our $line = ('a' x 500) . ' ' . ('a' x 20);

sub code { "use strict; use warnings; $_[0]; 1;" }

cmpthese -2, {
    literal     => code(' our $line =~ / a .{1,10} \ /x; '),
    class       => code(' our $line =~ / a .{1,10} [ ] /x; '),
    class_nodot => code(' our $line =~ / a [^\n]{1,10} [ ] /smx; '),
};

>c:\progs\perl588\bin\perl 674979.pl
                Rate class_nodot       class     literal
class_nodot   2206/s          --        -44%       -100%
class         3908/s         77%          --       -100%
literal     805341/s      36409%      20507%          --

>c:\progs\perl5100\bin\perl 674979.pl
                Rate class_nodot     literal       class
class_nodot 618394/s          --        -12%        -13%
literal     704734/s         14%          --         -1%
class       708360/s         15%          1%          --

Re^2: The cost of unchecked best practices
by bart (Canon) on Mar 20, 2008 at 20:18 UTC
    Does that mean that, apart from these optimizations, the regex engine actually got slower in other cases? I see that the benchmark for "literal" dropped from 8E5/sec to 7E5/sec. That's a speed drop of about 12%. I assume these benchmarks were run on the same computer...
      Yes, same computer. I just reran the tests (3 times each, one after the other), and got the same results as before:
      >c:\progs\perl588\bin\perl 674979.pl
                      Rate class_nodot       class     literal
      class_nodot   3037/s          --        -24%       -100%
      class         4008/s         32%          --       -100%
      literal     859954/s      28218%      21354%          --

      >c:\progs\perl588\bin\perl 674979.pl
                      Rate class_nodot       class     literal
      class_nodot   3041/s          --        -26%       -100%
      class         4087/s         34%          --       -100%
      literal     868745/s      28464%      21158%          --

      >c:\progs\perl588\bin\perl 674979.pl
                      Rate class_nodot       class     literal
      class_nodot   3041/s          --        -26%       -100%
      class         4117/s         35%          --       -100%
      literal     834248/s      27329%      20162%          --

      >c:\progs\perl5100\bin\perl 674979.pl
                      Rate class_nodot     literal       class
      class_nodot 667282/s          --        -10%        -11%
      literal     741037/s         11%          --         -1%
      class       747754/s         12%          1%          --

      >c:\progs\perl5100\bin\perl 674979.pl
                      Rate class_nodot     literal       class
      class_nodot 667704/s          --        -10%        -12%
      literal     742924/s         11%          --         -2%
      class       758776/s         14%          2%          --

      >c:\progs\perl5100\bin\perl 674979.pl
                      Rate class_nodot     literal       class
      class_nodot 657342/s          --        -11%        -14%
      literal     740217/s         13%          --         -3%
      class       766213/s         17%          4%          --

      So yes, something is slower. Not necessarily the regexp, but it's likely.
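
      For anyone who wants to check the size of the drop, here is a minimal sketch of the arithmetic. It is not part of the benchmark script; the helper names mean and pct_change are made up for illustration, and the rates are simply the "literal" figures copied from the three reruns of each perl above. Averaged over the reruns, it puts the 5.10 "literal" case roughly 13% below 5.8.8.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# "literal" rates (iterations per second) copied from the reruns above.
my @perl588  = (859954, 868745, 834248);
my @perl5100 = (741037, 742924, 740217);

# Arithmetic mean of a list of numbers (hypothetical helper).
sub mean {
    my $sum = 0;
    $sum += $_ for @_;
    return $sum / @_;
}

# Relative change from $old to $new in percent; negative means slower
# (hypothetical helper).
sub pct_change {
    my ($old, $new) = @_;
    return ($new - $old) / $old * 100;
}

printf "literal, 5.8.8 -> 5.10.0: %.1f%%\n",
    pct_change(mean(@perl588), mean(@perl5100));
```

      Three runs per perl is a small sample, so treat the figure as a rough indication rather than a measurement of the regex engine alone.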