Re: pcre vs perl regex engine
by zentara (Cardinal) on Jan 16, 2009 at 19:24 UTC
See Analyzing regular expression performance. I can't remember where I read it, but Perl regexes are often faster than C, for various reasons: the Perl engine is better developed, it was designed for parsing text, etc. I'm not saying always, but often Perl is supposedly faster. Even the pcre man page points to times when it is slow, due to idiosyncrasies of C.
Well, JavaFan, you are too smart for me to argue with, but googling for "pcre regex speed" comes up with complaints that libpcre's UTF-8 regexing is very slow, that some globals being used are not thread-safe and may cause memory growth and possible crashes in threads, etc. Also, I would point out that even though the Perl regex engine is written in C, it is not the same C code that the pcre lib is made from. So I'm not saying that C is faster than C; I'm saying the C code in Perl's regex engine may run faster for many regexes than the C code in the pcre lib, and that may be due to the difficulty of setting up the rest of the C program to run the regexes in the most efficient manner. But isn't that what it is all about? You can do the match faster in Perl, because the regex is often a one-liner. Doesn't programmer time count in this? jeteve didn't specifically say only machine-time speed, although that was probably his intention.
I googled, and haven't found a real benchmark comparing the two engines with a good set of regex stress tests. Maybe you could use your knowledge to set up a benchmark and post it.
Re: pcre vs perl regex engine
by mr_mischief (Monsignor) on Jan 16, 2009 at 19:34 UTC
PCRE and especially the regex engines for strictly POSIX-compliant regexes don't do nearly as much as the Perl regex engine. Do you mean you want to know which is fastest within the narrow confines of what they all can do?
Re: pcre vs perl regex engine
by Joost (Canon) on Jan 16, 2009 at 20:28 UTC
CL-PPCRE (a regex engine implemented in pure Common Lisp, mostly compatible with Perl's regex engine) is apparently pretty fast.
Some comparisons can be found here. In short, it's usually a little faster than Perl, sometimes about twice as slow, and sometimes a lot faster.
Re: pcre vs perl regex engine
by jettero (Monsignor) on Jan 16, 2009 at 19:14 UTC
I'd like to see some numbers on that and on the POSIX regex engine. But any benchmarks would need a huge variety of patterns as, no doubt, some of the engines are designed with different purposes in mind than others.
Then it's not much of a test. How a RE engine handles backtracking should at the very least be relevant.
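To make the backtracking point concrete, here is a minimal sketch. It uses Python's `re` module purely as a stand-in backtracking engine; it is neither Perl's engine nor PCRE, and other engines may behave quite differently on the same pattern. The pattern `(a+)+$`, the helper name `time_failed_match`, and the input sizes are all my own choices for illustration.

```python
import re
import time

# Nested quantifiers: a classic pattern that backtracks heavily on a near-miss.
# (Illustrative choice; not taken from any benchmark mentioned in this thread.)
PATHOLOGICAL = re.compile(r"(a+)+$")


def time_failed_match(n):
    """Time a failing match against n 'a' characters followed by 'b'."""
    subject = "a" * n + "b"  # the trailing 'b' forces the match to fail
    start = time.perf_counter()
    matched = PATHOLOGICAL.match(subject) is not None
    elapsed = time.perf_counter() - start
    return elapsed, matched


if __name__ == "__main__":
    for n in (10, 14, 18):
        elapsed, matched = time_failed_match(n)
        print(f"n={n:2d} matched={matched} {elapsed * 1e3:.3f} ms")
```

The failing case is the interesting one: the engine has to exhaust every way of splitting the run of `a`s before giving up, so the time grows quickly with `n`. A benchmark that only times successful, non-backtracking matches would miss this entirely.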
Re: pcre vs perl regex engine
by thunders (Priest) on Jan 16, 2009 at 20:12 UTC
How exactly would you go about benchmarking this? It's my understanding that, due to a bunch of Perl-specific features, you can't easily embed Perl's regex engine alone in a C program (you could, of course, embed an entire Perl interpreter). And obviously it wouldn't make much sense to embed pcre in a Perl program. So I can't think of a way to do an apples-to-apples comparison.
You could time a series of Perl and C programs that run a variety of regexes over various types of input, for a least-common-denominator feature set. But for most programs there are a number of other factors that will impact performance more than the choice of regex engine (interpreted vs. compiled, I/O libraries, memory management, GC, etc.).
But beyond writing programs that only do regex matching, I think the choice between pcre and Perl is whether you want to write the rest of the program in C/C++ or in Perl.
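For what it's worth, the "time a variety of regexes over various inputs" idea could be sketched roughly like this. It is written in Python only as a neutral harness, so it measures Python's `re` engine, not Perl or pcre; you would need an equivalent driver per engine, with the caveats above about everything else in the program. The pattern set, the inputs, and the name `run_benchmark` are hypothetical.

```python
import re
import time

# Hypothetical pattern set restricted to features most engines share
# (literals, character classes, alternation, a simple anchor).
PATTERNS = {
    "literal": r"needle",
    "char_class": r"[0-9]+",
    "alternation": r"cat|dog|bird",
    "anchored": r"^abc",
}

# Hypothetical inputs; a real benchmark would use varied, realistic text.
INPUTS = [
    "hay " * 1000 + "needle",
    "abc 12345 xyz " * 200,
    "the bird saw the dog " * 200,
]


def run_benchmark(patterns, inputs, repeats=5):
    """Return {name: (best_seconds, total_matches)} for each pattern."""
    results = {}
    for name, source in patterns.items():
        compiled = re.compile(source)  # compile outside the timed region
        best = float("inf")
        matches = 0
        for _ in range(repeats):
            start = time.perf_counter()
            matches = sum(len(compiled.findall(text)) for text in inputs)
            elapsed = time.perf_counter() - start
            best = min(best, elapsed)  # keep the fastest run to reduce noise
        results[name] = (best, matches)
    return results


if __name__ == "__main__":
    for name, (seconds, matches) in run_benchmark(PATTERNS, INPUTS).items():
        print(f"{name:12s} {seconds * 1e3:8.3f} ms  {matches} matches")
```

Returning the match count alongside the time is deliberate: if two engines disagree on how many matches they found, the timings aren't comparable in the first place.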