in reply to Re^5: Perl regexp matching is slow??
in thread Perl regexp matching is slow??

Yes, you're looking at the right graph. If you look at the left hand side, that's running a match on a single-byte string, so essentially all the time is in construction.

Unless I missed something looking at the timing scripts (.tar.gz), you're comparing the run-time of entire programs -- e.g. perl/ruby/etc programs -- versus your compiled C-code. That doesn't seem like a useful comparison of regexp algorithms.

Are you really measuring the load time of Perl, the compilation time of the perl regexp test program and its execution and comparing that to the run time of your NFA code that is just a couple hundred lines of C?

The other test programs all seem to be interpreted or call sh to execute a command line tool. Whereas your README suggests that the comparable nfa test program is a compiled C executable being run directly.

-xdg

Code written by xdg and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.

  • Comment on Is this apples vs oranges? (was Re^6: Perl regexp matching is slow??)

Replies are listed 'Best First'.
Re: Is this apples vs oranges? (was Re^6: Perl regexp matching is slow??)
by rsc (Acolyte) on Feb 05, 2007 at 02:28 UTC
    You're right. I am timing entire programs.

    But I am timing one program running enough iterations of the match to take 10 seconds worth of CPU time, up to a max of 10,000 iterations. If you assume Perl starts in less than 10ms (it loads in well under that on my machine), then the extra time added by startup is no more than 1 microsecond in the final numbers. Also, the point was really the difference in growth rates: the 60+ seconds for backtracking isn't being spent during program load. (And PCRE is a C program too.)

      Fair point. I missed the number of iterations in my quick scan through the code (more descriptive variable names would have helped). My question didn't relate to the 60+ second range but the fraction of a second range, since it looked in the large graph as if your special-purpose code had a near-order-of-magnitude "head start" over the average general purpose tool examined.

      -xdg

      Code written by xdg and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.