Re: Pluggable regex engine in Perl
by ELISHEVA (Prior) on Dec 27, 2010 at 08:48 UTC
|
I'm sure there are people who do take this as an invitation to write their own "just cause", but the practical reason for a custom regex engine would be the need to emulate the behavior of some other regex syntax. Bash, PHP, Java, various flavors of the grep command, all have their own slight regular expression nuances.
For example, a person might need to port a large amount of code from PHP to Perl. Rather than study each and every regex in that code, it might make more sense to port the code but leave the original regular expressions in place. Converting syntax is usually fairly straightforward. Deciphering and converting regular expressions, not always so. You would have that option with a pluggable regex API.
| [reply] |
Re: Pluggable regex engine in Perl
by moritz (Cardinal) on Dec 27, 2010 at 09:27 UTC
|
I've heard that the regex API has changed quite a bit between 5.10 and 5.12 due to the promotion of regexes to first class objects. If that's true, one should either just target 5.12 or newer when writing a new plugin, or be aware of the differences and use some #ifdefs.
I wonder if any of these are in use and what are the use cases? What issues do these plugs solve that is impossible, harder, slower or have some other issues with the standard regex engine of Perl?
The now-defunct re::engine::TRE had two features that made it attractive for some uses. First, it would match the longest of several alternatives (instead of the first one that succeeds, as Perl does). Second, it used a non-backtracking state machine internally whenever possible, which means that pathological exponential-time behavior doesn't occur as easily as with the Perl engine.
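To make the alternation difference concrete, here is a minimal sketch using only Perl's built-in engine. The helper `longest_alternative` is a hypothetical name made up for this illustration; it is not part of re::engine::TRE or any other module, and it only imitates longest-match behavior for a list of alternatives tried at the start of a string.

```perl
use strict;
use warnings;

my $text = 'foobar';

# Perl tries the alternatives left to right and keeps the first one
# that lets the overall match succeed, so "foo" wins over "foobar".
my ($first) = $text =~ /(foo|foobar)/;

# Hypothetical helper (an assumption for this sketch): try every
# alternative at the start of the string and keep the longest match,
# which is roughly what a longest-match engine would report.
sub longest_alternative {
    my ($string, @alternatives) = @_;
    my $best;
    for my $alt (@alternatives) {
        next unless $string =~ /\A($alt)/;
        $best = $1 if !defined $best || length($1) > length($best);
    }
    return $best;
}

my $longest = longest_alternative($text, qr/foo/, qr/foobar/);

print "first-match semantics:   $first\n";
print "longest-match emulation: $longest\n";
```

With Perl's engine the first line reports `foo`, while the emulation reports `foobar`. Reordering the alternatives by hand gets the same result in simple cases, but that stops being practical once the alternatives are themselves patterns of varying length.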
| [reply] [d/l] |
Re: Pluggable regex engine in Perl
by dgl (Novice) on Dec 27, 2010 at 12:16 UTC
|
I'm the author of re::engine::RE2.
As for motivation, it was mostly to learn a bit about this area of Perl; however, I do see uses for RE2 because its matching is much faster than Perl's.
For example, combined with an mmapped scalar, I can match a regexp against 1 GiB of text in about 10 seconds (on a Core 2 Duo); Perl's regex engine doesn't even come close to that. You can see how Google's Code Search can be so fast.
There are some issues with Perl's UTF-8 handling (frankly it's insane), but once I've worked around those, re::engine::RE2 should be nearly a drop-in replacement for Perl's regex engine, only faster.
| [reply] |
Re: Pluggable regex engine in Perl
by Khen1950fx (Canon) on Dec 27, 2010 at 09:06 UTC
|
Why not write your own engine in Perl? You could get started by reading the documentation for re::engine::Plugin. It's still in its early stages but workable.
| [reply] |
Re: Pluggable regex engine in Perl
by JavaFan (Canon) on Dec 27, 2010 at 15:56 UTC
|
I haven't written a plugin, but I don't find it hard to come up with reasons to write plugins:
- You want a different syntax (perhaps the same syntax as language X).
- You want different rules (for instance, POSIX's "longest match" preference over Perl's "first match").
- You want an engine that's optimized for certain cases. Perhaps you want a pure DFA, sacrificing functionality for speed.
- You may want to do matching in a particular encoding.
I've no idea whether any of the plugins have been written with those reasons in mind, but I wouldn't be surprised if someone at some point does.
| [reply] |
Re: Pluggable regex engine in Perl
by AnomalousMonk (Archbishop) on Dec 27, 2010 at 16:57 UTC
|
I have read that the Tcl regex engine is a mixture of DFA and NFA: DFA is used when possible, NFA otherwise. Any comments on the truth of this notion, how appropriate this approach is, its performance vis-a-vis Perl's standard regex engine, and whether it might be available as a plug-in?
| [reply] |
Re: Pluggable regex engine in Perl
by Anonymous Monk on Dec 28, 2010 at 04:05 UTC
|
Using this together with Devel::Declare, can one create something equivalent to Perl 6 grammars?
Just asking, as such a thing would be a great help.
| [reply] |
Re: Pluggable regex engine in Perl
by Anonymous Monk on Dec 28, 2010 at 15:57 UTC
|
Btw, it would be interesting to be able to use this from Perl:
https://github.com/dprokoptsev/pire
| [reply] |