shotgunefx has asked for the wisdom of the Perl Monks concerning the following question:

I recently discovered the very handy Regexp::Common module. (I'd seen it before but never looked into it.) I was thinking it would be nice to be able to pull out individual expressions for those times when you don't want to take the speed hit of loading and using the module. So I looked into YAPE::Regex::Explain, which seemed to do the trick.

My question is: does anyone see any caveats to this approach? From reading the PODs, it doesn't seem like there would be any problems, but of all my Perl skills, regexes are certainly my weakest. Are there other ways of getting the source form of a regex? I couldn't find much on the subject.

-Lee

"To be civilized is to deny one's nature."

Re: Decompiling Regular Expressions
by diotalevi (Canon) on Apr 03, 2003 at 18:57 UTC

    Three ways in order of increasing difficulty

    • Use regex objects: print qr/(?:this)(?!didn't)match/
    • Use the debug mode of the 're' module: use re 'debug'; /(?:this)(?!didn't)match/
    • Use Dominus' Rx module.

    I list the Rx module last because when I last looked at it, it required patching first. It's also probably the one you want to use if you are serious about this.
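
A minimal sketch of the first option, using only core Perl. The variable names are illustrative. The wrapper that Perl puts around the stringified pattern depends on the Perl version (older releases print a (?-xism:...) style wrapper, newer ones a shorter form), so the sketch only relies on the original pattern text appearing inside it.

```perl
use strict;
use warnings;

# Compile the pattern once as a regex object.
my $re = qr/(?:this)(?!didn't)match/;

# A regex object stringifies to its source form plus a flags wrapper.
my $source = "$re";
print "$source\n";

# The printed text can be compiled again and behaves like the original.
my $rebuilt = qr/$source/;
print "thismatch" =~ $rebuilt ? "match\n" : "no match\n";
```

Adding use re 'debug'; before the pattern is compiled would additionally dump the compiled program to STDERR, which is the second option in the list.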

      Thanks for the pointers. I don't mind using YAPE at all; I'm just not sure if there are any gotchas I should look out for. I don't intend to serialize these or otherwise store them, just to add them to the program source for the projects where I don't want the overhead.

      -Lee

        Oh, I see. Well, YAPE doesn't cover all of regex-dom, but you'll be safe as long as you stick to the YAPE-supported subset of regex constructs. I only gave you advice for solving the more general "decompile a regex" problem. YAPE partially solves the problem "parse a regex", which is quite different and eventually devolves to "parse perl".

Re: Decompiling Regular Expressions
by Abigail-II (Bishop) on Apr 03, 2003 at 20:46 UTC
    I'm not quite sure what you mean by "pulling out expressions", but you might want to know that Regexp::Common groups the patterns in classes, and you can selectively load the classes. For instance:
    use Regexp::Common qw /number URI/;

    only loads the number and URI patterns. Currently, there are 11 different classes.

    From your post, I didn't get any idea what you mean by "getting the source form of a regex" in relation to Regexp::Common, nor what YAPE::Regex::Explain is doing for you.

    Abigail

      I'm sure my question could have been phrased better. Basically, what I was thinking is that there are certain times when you might want to use an expression without requiring the module. (Normally I would just stick with "use"ing it and get the benefit of the module's maintenance.) So I was curious whether there was a way to go from a compiled expression to something I could add to a script without requiring the module for the one regex.

      A simple example: if I wanted a pattern to match a real number but didn't want to require Regexp::Common, I could do something like this.
      #!/usr/bin/perl
      use warnings;
      use strict;
      use Regexp::Common qw/number/;
      use YAPE::Regex::Explain;

      print YAPE::Regex::Explain->new($RE{num}{real})->explain();
      __END__
      The regular expression:

      (?-imsx:(?:(?i)(?:[+-]?)(?:(?=[0123456789]|[.])(?:[0123456789]*)(?:(?:[.])(?:[0123456789]{0,}))?)(?:(?:[E])(?:(?:[+-]?)(?:[0123456789]+))|)))

      matches as follows:

      NODE                     EXPLANATION
      ------------------------------------------------------------
      BLAH BLAH BLAH
      and just add
      $thing=~/(?-imsx:(?:(?i)(?:[+-]?)(?:(?=[0123456789]|[.])(?:[0123456789]*)(?:(?:[.])(?:[0123456789]{0,}))?)(?:(?:[E])(?:(?:[+-]?)(?:[0123456789]+))|)))/
      to my code.

      This seems to work, but as regexes are my weakest Perl skill, I didn't know whether there are any caveats (things known not to work) with this approach, or whether there is a more reliable way of doing it. Obviously, "cutting and pasting" without total understanding is a bad idea in general. As far as regexes go, though, I'm willing to put stock in a few monks' work (yourself included), but I didn't know if there was anything inherently wrong with the approach.
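
One way to sanity-check a pasted pattern is to run it against a few sample strings without loading Regexp::Common at all. The sketch below is only an illustration: the pattern text is the one quoted in this post, while the ^ and $ anchors, the variable names, and the sample strings are additions made for the test.

```perl
use strict;
use warnings;

# Pattern text copied from the post; the ^ and $ anchors are added here
# so that each test string has to match as a whole.
my $real = qr/^(?-imsx:(?:(?i)(?:[+-]?)(?:(?=[0123456789]|[.])(?:[0123456789]*)(?:(?:[.])(?:[0123456789]{0,}))?)(?:(?:[E])(?:(?:[+-]?)(?:[0123456789]+))|)))$/;

# Record and report which sample strings the standalone pattern accepts.
my %seen;
for my $candidate ('3.14', '-1e10', '+.5', 'abc', '1e') {
    $seen{$candidate} = $candidate =~ $real ? 1 : 0;
    printf "%-6s %s\n", $candidate,
        $seen{$candidate} ? 'matches' : 'does not match';
}
```

Without the anchors the pattern would match any real number found anywhere inside a longer string, which is how it behaves when pasted as shown in the post.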

      -Lee
        That's like going from London to Greenwich via Sydney. Why not just:
        perl -MRegexp::Common -wle 'print $RE{num}{real}'

        Note that this (or the YAPE::) approach doesn't work with recursive regexps.
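
A rough illustration of why printing fails for recursive patterns, assuming the recursion is written with a (??{ ... }) block that refers to a Perl variable. The balanced-parentheses pattern and the variable names below are made up for this sketch and are not taken from Regexp::Common.

```perl
use strict;
use warnings;

# Hypothetical recursive pattern: the (??{ ... }) block re-enters the
# pattern through the lexical variable $balanced at match time.
my $balanced;
$balanced = qr/\((?:[^()]+|(??{ $balanced }))*\)/;

# In this program the variable exists, so nested parentheses match.
my $matched = '';
if ("f(a(b)c)" =~ $balanced) {
    $matched = $&;
}
print "matched: $matched\n";

# Printing the regex only yields text that still names the variable.
my $source = "$balanced";
print "$source\n";
```

The printed text contains the code block verbatim, so pasting it into another script only works if a suitable variable is set up there as well.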

        Abigail