mirod has asked for the wisdom of the Perl Monks concerning the following question:

I need to go back from the stringified form of a regexp to a real regexp. The regexp is stringified when I use it as the key to a hash (which I have to do as the current API dictates the use of a hash here).

The little bit of code below explains what happens and what I have done to get the regexp back.

#!/usr/bin/perl -w
use strict;

my $regexp = qr/^x/i;                                # a regexp

# stringify it
my $hash   = { $regexp => 1 };                       # keys are stringified
my $string = (keys %$hash)[0];                       # a string
print "$string (ref: ", ref $string,")\n";           # and I prove it!
                                                     # the string is: '(?i-xsm:^x)'

# regex-ify it
if( $string=~m{^\(\?([xism]*)-([xism]*):(.*)\)$})    # match the interesting bits
  { my $nregexp= qr/(?$1:$3)/;                       # rebuild it
    print "$nregexp (ref: ", ref $nregexp,")\n";     # a regexp again!
                                                     # it stringifies as '(?-xism:(?i:^x))'
  }

My questions are: did I forget anything? Does it work across versions of perl (I tested it under 5.8.6)? Is there anything simple that I forgot/did not know about?

Re: Converting a stringified regexp back into a regexp
by ikegami (Patriarch) on Apr 07, 2005 at 14:35 UTC

    For efficiency purposes, you might want to keep the non-stringified regexp inside the hash. In other words, changing:

    $hash{"$regexp"} = $value;
    ...
    $value  = $hash{$key};
    $regexp = qr/$key/;   # Expensive

    to

    $hash{"$regexp"} = [ $regexp, $value ];
    ...
    $value  = $hash{$key}[1];
    $regexp = $hash{$key}[0];

    That way, you won't have to recompile the regexp.

    OR! If you can't change the hash's structure, keep the compiled regexp in a second hash:

    $hash{"$regexp"}    = $value;
    $regexps{"$regexp"} = $regexp;
    ...
    $value  = $hash{$key};
    $regexp = $regexps{$key};
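
    A minimal runnable sketch of the second-hash variant. The names %handlers (standing in for the hash the API requires) and %regexps (the side table) are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;

my %handlers;   # hypothetical: the hash the API insists on, keyed by string
my %regexps;    # hypothetical side table: same keys, compiled regexps as values

for my $re ( qr/^x/i, qr/^para$/ ) {
    $handlers{$re} = "handler for $re";   # $re is stringified when used as a key
    $regexps{$re}  = $re;                 # the value keeps the Regexp object
}

for my $key ( sort keys %handlers ) {
    my $re = $regexps{$key};              # looked up, so no qr/$key/ is needed
    print "$key => ", ref($re), "\n";
}
```

    The compiled object is found through its own stringified form, so the original hash keeps its shape and nothing is recompiled.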

      A variant of this idea would be to memoize qr:

      { my %cache; sub my_qr { $cache{ $_[0] } ||= qr/$_[0]/ } }
      though one loses some of the modifier syntax.
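
      A short sketch of how the memoized constructor might be used. Since my_qr takes a plain pattern string, trailing modifiers such as /i cannot be passed; the example assumes an inline (?i) is an acceptable substitute.

```perl
use strict;
use warnings;
use Scalar::Util 'refaddr';

{
    my %cache;   # pattern string => compiled regexp
    sub my_qr { $cache{ $_[0] } ||= qr/$_[0]/ }
}

my $first  = my_qr('(?i)^x');   # no trailing /i available, so use inline (?i)
my $second = my_qr('(?i)^x');   # served from %cache, not recompiled

print "same object\n"   if refaddr($first) == refaddr($second);
print "matches 'Xml'\n" if 'Xml' =~ $first;
```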

      the lowliest monk

      Indeed, I only rebuild the regexp once; afterwards I keep a hash $hash{"$regexp"} = { regexp => $regexp, value => $value }; (actually it's handler instead of value, but you get the idea).

      Update: sorry, I think I misunderstood your comment. I _have_ to accept the regexp as a string in order to keep backwards compatibility. And as this is for XML::Twig, I don't think I can go about changing the code that uses it ;--(

Re: Converting a stringified regexp back into a regexp
by Jenda (Abbot) on Apr 07, 2005 at 15:31 UTC

    Why don't you just $nregexp = qr/$string/;? So you end up with an additional (?-xism:...) around the regexp if you stringify it again, so what?
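
    A small sketch of that round trip. The input string is assumed to be a key exactly as Perl 5.8 stringifies qr/^x/i; the text of the outer wrapper added by the second qr depends on the perl version, so it is not printed here.

```perl
use strict;
use warnings;

# Assumed input: a hash key as Perl 5.8 stringifies qr/^x/i.
my $string  = '(?i-xsm:^x)';
my $nregexp = qr/$string/;    # the flags travel inside the (?i-xsm:...) group

print ref($nregexp), "\n";
print "still case-insensitive\n" if 'Xml'  =~ $nregexp;
print "still anchored\n"         if 'axml' !~ $nregexp;
```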

    Jenda
    We'd like to help you learn to help yourself
    Look around you, all you see are sympathetic eyes
    Stroll around the grounds until you feel at home
       -- P. Simon in Mrs. Robinson

      It makes sense. I still have to figure out that the string is a regexp though (it can also be a regular string), so the extra $nregexp= qr/(?$1:$3)/; is cheap. $nregexp = qr/$string/; might be a little safer though, in case the syntax for regexp changes (but then I would probably have to change the regexp that matches it anyway).

Re: Converting a stringified regexp back into a regexp
by Roy Johnson (Monsignor) on Apr 07, 2005 at 15:50 UTC
    Nit: you needlessly capture $2. It doesn't even need to be grouped, let alone captured.

    Caution: Contents may have been coded under pressure.

      Thanks, applied.

      I had always dreamed of writing this :--)

Re: Converting a stringified regexp back into a regexp
by ysth (Canon) on Apr 07, 2005 at 16:54 UTC
    I think you want m{}s if your regex may have literal newlines. And if all the xism flags are on, there won't be a -.

      I am matching XML element names, so probably no newlines there. But indeed I have to make the second group optional:  m{^\(\?([xism]*)(?:-[xism]*)?:(.*)\)$}.
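
      Putting the suggestions from this thread together, a detection helper could look like the sketch below. The name string_to_regexp is hypothetical. Accepting a leading "^" and any lowercase flag letters goes beyond what was discussed here: it is an allowance for perl 5.14 and later, which stringify qr/^x/i as (?^i:^x). The check is only a heuristic, since an ordinary string that happens to look like (?i:...) would also be compiled.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a compiled regexp if $string looks like a
# stringified qr//, and nothing (undef in scalar context) otherwise.
sub string_to_regexp {
    my ($string) = @_;
    return unless $string =~ m{
        ^ \( \?            # opening of the flag group
        \^? [a-z]*         # flags that are on (caret form assumed for 5.14+)
        (?: - [a-z]* )?    # flags that are off, absent when all are on
        : .* \) $          # the pattern itself, dot spans newlines thanks to s
    }xs;
    return qr/$string/;    # let perl parse the flags rather than rebuilding them
}

for my $key ( '(?i-xsm:^x)', '(?xism:^x)', 'para' ) {
    my $re = string_to_regexp($key);
    print "$key: ", ( $re ? 'regexp' : 'plain string' ), "\n";
}
```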

        How would you deal with the 'g' or 'o' modifiers, e.g. m/blah/g?

Re: Converting a stringified regexp back into a regexp
by tlm (Prior) on Apr 07, 2005 at 14:37 UTC

    I don't see anything wrong with your code, but I have never attempted this. I would just use strings like '^x' as hash keys, and then build the regexps as needed from these plain string keys.
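
    For illustration, a sketch of that approach. The %handler_for table and the element names are made up; any modifier would have to be written inline in the key, e.g. '(?i)^x'.

```perl
use strict;
use warnings;

# Hypothetical table: plain pattern strings as keys, handlers as values.
my %handler_for = (
    '^x'     => 'starts with x',
    'title$' => 'ends with title',
);

for my $name ( 'xref', 'subtitle' ) {
    for my $pattern ( sort keys %handler_for ) {
        my $re = qr/$pattern/;    # compiled on demand from the plain string
        print "$name: $handler_for{$pattern}\n" if $name =~ $re;
    }
}
```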

    the lowliest monk