in reply to Re^3: pcre vs perl regex engine
in thread pcre vs perl regex engine
> libpcre's utf-8 regexing is very slow, that some globals being used are not thread-safe and may cause memory gains and possible crashes in threads, etc.

Well, Perl's regexp engine has efficiency problems with UTF-8 as well (think character classes). Perl (and its regexp engine) has had its share of memory leaks (and I don't think many people are willing to bet they're all gone). And while I'm not aware of any thread-related problems with respect to the regexp engine, threads in Perl are such a performance loss that, if you build a perl, threads are disabled by default. And while there may not be many thread issues in the regexp engine, because perl doesn't share anything between threads by default, perl's regexp engine isn't reentrant. And it took only 20 years to make it non-recursive (but artifacts of its recursive past still pop up).
> Also I would point out that even though the perl regex engine is written in C, it is not the same C code that the pcre lib is made from.

No one was arguing that the two implementations share code.
> I'm saying the C code in Perl's regex engine may run faster for many regexes than the C code in the pcre lib; and that may be due to the difficulty of setting up the rest of the C program to run the regexes in the most efficient manner.

That I don't get. perl (lowercase) is also a C program, and it needs to set things up before it can run the regexp engine as well.
> You can do the match faster in Perl, because the regex is often a 1 liner, dosn't programmer time count in this?

No, that would be silly. PCRE is a *library*. Perl is a *programming language*. Of course, if you're going to start from scratch, Perl is going to beat PCRE. In the same way, a bicycle is going to beat a V8 engine - you can cycle quite a number of miles before you've built a car around the V8 engine. But once the car is there, it becomes a different matter. By the same reasoning, 99.99% of matches are done faster by humans, because a human can scan the text in their head faster than you can type your Perl 1-liner into an editor.
PCRE is meant as a library: to be built into other applications/languages. In such an application or language, it may also be a 1-liner to do the match.
> I googled, and havn't found real benchmark comparing the 2 engines with a good set of regex stress tests, maybe you could use your knowledge to setup a benchmark, and post it.

Why should I? I'm not the one making unfounded claims about one being faster than the other. But rest assured, if I were to claim Perl was faster than PCRE (or the other way around), I would back it up with a benchmark. I know enough about Perl regexes not to make any claims about how they compare to PCRE (or any other regexp engine). And I do know enough to be critical of other people's claims.
So, if you're so sure about Perl's superiority, post the benchmark. Make sure you also include the matches where Perl does take a long time. And remember that when Perl appears to be fast at matching/not matching, it's usually because it doesn't need the regexp engine at all - because the optimizer already figured it out.
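To illustrate that last point, here is a minimal sketch (not a benchmark, and it says nothing about PCRE). The pattern `/(a+)+z/` and the helper name `timed_match` are made up for this example. The string contains no `z`, so perl can reject the match by looking for the required literal before the regexp engine proper ever starts; without that check, the nested quantifiers would backtrack for a very long time.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical helper for this sketch: attempt the match and report
# whether it matched and how many seconds the attempt took.
sub timed_match {
    my ($text, $re) = @_;
    my $start   = time;
    my $matched = $text =~ $re ? 1 : 0;
    return ($matched, time - $start);
}

# Forty 'a's and no 'z'. The pattern cannot match without a literal 'z',
# so the failure can be decided without trying every way of splitting
# the run of 'a's between the nested quantifiers.
my $text = 'a' x 40;
my ($matched, $elapsed) = timed_match($text, qr/(a+)+z/);

printf "matched=%d elapsed=%.6fs\n", $matched, $elapsed;
```

Adding `use re 'debug';` at the top of the script makes perl print what it worked out about the pattern and how the match attempt was decided. A benchmark that only uses patterns like this one measures the optimizer, not the engine.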
Re^5: pcre vs perl regex engine
by zentara (Cardinal) on Jan 18, 2009 at 10:59 UTC