lemnisca has asked for the wisdom of the Perl Monks concerning the following question:

Part of the program I am working on needs to convert regular expressions in vim style to Perl regular expressions. I'm reading them in from a file and converting them at runtime.

I've been looking through the vim documentation and found this description of the differences between the two regex styles. Some of those don't look too difficult to deal with - for instance, I can easily change \c to (?i). However, for most of them I'm not sure how I'd go about implementing them.

For example, I need to change \%(atom) into (?:atom). At first glance that looks OK, but then I started thinking about nested parentheses. A greedy quantifier might eat up too much of the string, since it runs to the last closing parenthesis in the whole pattern. A non-greedy quantifier stops at the first closing parenthesis, so it won't work for something like /\%(foo(bar))/. Is there a way to make this work?
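One way around the greedy/non-greedy dilemma is to count nesting depth rather than rely on a quantifier. The following is only a minimal sketch under the notation used in this post, where a group closes with a bare ")". The helper name matching_paren is hypothetical, and the sketch ignores character classes such as [()].

```perl
use strict;
use warnings;

# Hypothetical helper: return the index of the ")" that closes the "(" found
# at index $open, or -1 if the parentheses are unbalanced.
# Assumption: a backslash escapes the next character, so "\(" and "\)" are
# skipped. Character classes such as [()] are NOT handled.
sub matching_paren {
    my ($pat, $open) = @_;
    my $depth = 0;
    for (my $i = $open; $i < length $pat; $i++) {
        my $ch = substr $pat, $i, 1;
        if ($ch eq '\\') { $i++; next; }   # skip the escaped character too
        $depth++ if $ch eq '(';
        if ($ch eq ')') {
            return $i if --$depth == 0;    # back at the starting depth
        }
    }
    return -1;
}

# The "(" of "\%(" sits at index 2.
print matching_paren('\%(foo(bar))baz(qux)', 2), "\n";   # 11
```

A non-greedy match would stop at index 10 and a greedy one at index 19; the depth counter finds the true partner at index 11.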

After reading this explanation of the 'magic' modifiers, I think they would be possible to do, if fairly complicated given the number of things I need to check for and replace. Same goes for the differing newline behaviour - it looks somewhat tricky but doable if I have the patience. :) However, I'm not sure how I could successfully match the 'atom' for the zero-width assertions, and the \& modifier looks like a nightmare; can anyone think of a good way to deal with either of these?

This is a project I'm doing to help me learn Perl. It's not anything critical, so in the end if there's no way to convert some of those incompatibilities then I'll probably just ignore them. :P But if anyone can come up with some ideas for nice ways to do this I'd really appreciate it, since I'd like my program to be able to correctly interpret as many patterns as possible. Alternatively, if anyone knows of some magical CPAN module that would do this for me, that'd be great - I searched but unfortunately couldn't find one.

Re: Converting vim regex to Perl regex
by duff (Parson) on Jan 27, 2006 at 06:32 UTC

    From your description it sounds like you're using regular expressions to convert from vim-style REs to perl-style REs. What you really need is a parser that groks vim-style REs from which you can output the appropriate perl-style REs.

    BTW, \& isn't a nightmare: you need to turn A\&B\&C into (?=A)(?=B)C. Though now that I think about it for more than 2 seconds, I suppose the "consumption" of the string would have to be nailed down. My version always consumes just enough of the string to match the last pattern, and I can see arguments for consuming the minimum or maximum of any of the patterns.
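The A\&B\&C rewrite can be sketched as below. This is an illustration only: convert_branch is a hypothetical name, and the sketch assumes that \& never appears inside a group or next to \|, and that each piece has already been translated to Perl syntax.

```perl
use strict;
use warnings;

# Hypothetical helper: rewrite A\&B\&C as (?=A)(?=B)C.
# Assumptions: "\&" occurs only at the top level of the pattern, and the
# pieces between the "\&" separators are already valid Perl regex syntax.
sub convert_branch {
    my ($branch) = @_;
    my @concats = ('');
    # Walk the pattern as backslash pairs or single characters, so that an
    # escaped backslash followed by "&" is not mistaken for a separator.
    while ($branch =~ /\G(\\.|.)/gcs) {
        if ($1 eq '\\&') { push @concats, '' }
        else             { $concats[-1] .= $1 }
    }
    my $last = pop @concats;   # only the last piece consumes text
    return join('', map { "(?=$_)" } @concats) . $last;
}

print convert_branch('.*Dog\&.*Cat\&.*'), "\n";   # (?=.*Dog)(?=.*Cat).*
```

Because every piece except the last becomes a lookahead, the match consumes only what the last piece matches: 'foobar\&foo' becomes (?=foobar)foo, which matches just "foo" in "foobar".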

    Anyway, you need a real parser. Some things weren't meant to be grokked by regular expressions :)
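The parser advice above can be sketched, in its simplest form, as a token-by-token translator. What follows is a minimal sketch, not a full Vim pattern parser: it assumes Vim's default "magic" mode, covers only a small hand-picked subset of tokens, and the names %VIM_TO_PERL and vim_to_perl are hypothetical. Tokens not listed in the table pass through unchanged and may not mean the same thing in Perl.

```perl
use strict;
use warnings;

# Vim ("magic" mode) token => Perl equivalent. A deliberately small subset
# chosen for illustration; anything not listed is copied through as-is.
my %VIM_TO_PERL = (
    '\\%(' => '(?:',   # non-capturing group opens
    '\\('  => '(',     # capturing group opens
    '\\)'  => ')',     # either kind of group closes
    '\\|'  => '|',     # alternation
    '\\+'  => '+',     # one or more
    '\\='  => '?',     # zero or one
    '\\?'  => '?',     # zero or one
    '\\<'  => '\\b',   # start of word (approximation)
    '\\>'  => '\\b',   # end of word (approximation)
    '('    => '\\(',   # unescaped ( ) | + ? are literals in Vim
    ')'    => '\\)',
    '|'    => '\\|',
    '+'    => '\\+',
    '?'    => '\\?',
);

sub vim_to_perl {
    my ($vim) = @_;
    my $perl = '';
    # Tokens: "\%(" first, then any backslash pair, then any single character.
    while ($vim =~ /\G(\\%\(|\\.|.)/gcs) {
        my $tok = $1;
        $perl .= exists $VIM_TO_PERL{$tok} ? $VIM_TO_PERL{$tok} : $tok;
    }
    return $perl;
}

print vim_to_perl('\%(foo\(bar\)\)\+'), "\n";   # (?:foo(bar))+
```

Since every opening token maps to an opening token and every closing token to a closing one, nesting survives the translation without the translator ever having to locate a matching parenthesis.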