in reply to Re^2: Perl regexp matching is slow??
in thread Perl regexp matching is slow??
Then it would be plenty safe to take a regexp from the user and use it to run a search. But with the current Perl regexes, no way.It depends on the implementation. If you use Thompson's algorithm as implemented in nfa.c from the article, then you end up with a run-time that is O(m*n) for length-m regexp and length-n text.
Doesnt this criticism apply equally to Thompsons algorithm or to DFA construction? I would have thought the only difference would be that in a DFA you'd see performance issues with compilation and not execution.
And just to be clear, that is no more incompatible with whatever optimizations you might add (like exact string search) than backtracking is.This is a myth. Like most myths it is based in a truth that is no longer true. When people precomputed the entire DFA before starting the search, like the original egrep did, construction was in fact pretty expensive. But no one does that anymore. They build out the DFA on the fly, as described in the article. If you build out the DFA on the fly, then the "compilation" or "construction" passes are more or less the same as for backtracking: basically you just have to syntax check the regexp and build some internal representation of a parse tree. If you look at the big graph near the end of the article, you'll see that the nfa.c in the article has cheaper construction cost than all of the fancier implementations, by about a factor of 4. And I wasn't really trying.
Sure. But the question is will Construction time + FBM + Verification be faster for an BNFA (backtracking NFA) than for a DFA? And will the DFA consume radically more memory than the BNFA? And my position is that most likely the DFA will win only on degenerate patterns. The rest of the time my feeling is that the BNFA will win hands down, mostly because of how cheap construction is.
|
---|
Replies are listed 'Best First'. | |
---|---|
Re^4: Perl regexp matching is slow??
by demerphq (Chancellor) on Feb 02, 2007 at 14:06 UTC | |
by rsc (Acolyte) on Feb 04, 2007 at 02:49 UTC | |
by demerphq (Chancellor) on Feb 04, 2007 at 10:22 UTC | |
by rsc (Acolyte) on Feb 04, 2007 at 20:24 UTC | |
by demerphq (Chancellor) on Feb 04, 2007 at 23:28 UTC | |
by xdg (Monsignor) on Feb 04, 2007 at 23:57 UTC | |
by rsc (Acolyte) on Feb 05, 2007 at 02:28 UTC | |
by xdg (Monsignor) on Feb 05, 2007 at 12:23 UTC | |
by ysth (Canon) on Feb 04, 2007 at 03:06 UTC |