Converting a stringified regexp back into a regexp

mirod has asked for the wisdom of the Perl Monks concerning the following question:

I need to go back from the stringified form of a regexp to a real regexp. The regexp is stringified when I use it as the key to a hash (which I have to do as the current API dictates the use of a hash here).

The little bit of code below explains what happens and what I have done to get the regexp back.

#!/usr/bin/perl -w
use strict;

my $regexp = qr/^x/i;                             # a regexp

# stringify it
my $hash   = { $regexp => 1 };                    # keys are stringified
my $string = (keys %$hash)[0];                    # a string

print "$string (ref: ", ref $string,")\n";        # and I prove it!
                                                  # the string is: '(?i-xsm:^x)'

# regex-ify it
if( $string=~m{^\(\?([xism]*)-([xism]*):(.*)\)$}) # match the interesting bits
  { my $nregexp= qr/(?$1:$3)/;                    # rebuild it
    print "$nregexp (ref: ", ref $nregexp,")\n";  # a regexp again!
                                                  # it stringifies as '(?-xism:(?i:^x))'
  }

My questions are: did I forget anything? Does it work across versions of perl (I tested it under 5.8.6)? Is there anything simple that I forgot or did not know about?


Re: Converting a stringified regexp back into a regexp
by ikegami (Patriarch) on Apr 07, 2005 at 14:35 UTC

For efficiency purposes, you might want to keep the non-stringified regexp inside the hash. In other words, changing:

$hash{"$regexp"} = $value;

...
$value  = $hash{$key};
$regexp = qr/$key/;  # Expensive: recompiles the regexp

to:

$hash{"$regexp"} = [ $regexp, $value ];

...
$value  = $hash{$key}[1];
$regexp = $hash{$key}[0];

That way, you won't have to recompile the regexp.

OR! If you can't change the hash's structure, keep the compiled regexp in a second hash:

$hash{"$regexp"} = $value;
$regexps{"$regexp"} = $regexp;

...
$value = $hash{$key};
$regexp = $regexps{$key};

Re^2: Converting a stringified regexp back into a regexp

by tlm (Prior) on Apr 07, 2005 at 14:52 UTC

A variant of this idea would be to memoize qr:

{
  my %cache;
  sub my_qr { $cache{ $_[0] } ||= qr/$_[0]/ }
}

the lowliest monk


Re^2: Converting a stringified regexp back into a regexp

by mirod (Canon) on Apr 07, 2005 at 14:51 UTC

Indeed I only do the re-building of the regexp once, afterwards I keep a hash $hash{"$regexp"} = { regexp => $regexp, value => $value }; (actually it's handler instead of value, but you get the idea).

update: sorry, I did not understand your comment I think. I _have_ to get the regexp passed as a string, in order to keep backwards compatibility. And as this is for XML::Twig, I don't think I can go about changing the code that uses it ;--(


Re: Converting a stringified regexp back into a regexp
by Jenda (Abbot) on Apr 07, 2005 at 15:31 UTC

Why don't you just $nregexp = qr/$string/;? You end up with an additional (?-xism:...) around the regexp if you stringify it again, but so what?
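A minimal sketch of that round trip. The exact wrapper text depends on the perl version (older perls stringify as (?i-xsm:...), while 5.14 and later use (?^i:...)), so the example checks behaviour rather than the exact string.

```perl
use strict;
use warnings;

my $regexp = qr/^x/i;        # original compiled regexp
my $string = "$regexp";      # what a hash key would hold
my $again  = qr/$string/;    # recompile straight from the string

# The flags travel inside the stringified form, so /i still applies.
print "still case-insensitive\n" if 'Xylophone' =~ $again;
print "ref: ", ref($again), "\n";
```

The only visible difference is the extra wrapper when $again is stringified in turn; matching behaviour is unchanged.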

Jenda
We'd like to help you learn to help yourself
Look around you, all you see are sympathetic eyes
Stroll around the grounds until you feel at home
-- P. Simon in Mrs. Robinson


Re^2: Converting a stringified regexp back into a regexp

by mirod (Canon) on Apr 07, 2005 at 16:03 UTC

It makes sense. I still have to figure out that the string is a regexp though (it can also be a regular string), so the extra $nregexp= qr/(?$1:$3)/; is cheap. $nregexp = qr/$string/; might be a little safer though, in case the syntax for regexp changes (but then I would probably have to change the regexp that matches it anyway).
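A rough sketch of that detection step, combined with the plain qr/$string/ rebuild. The helper name maybe_regexp is hypothetical, and the pattern is an assumption that accepts both the older (?i-xsm:...) form and the (?^i:...) form used by perl 5.14 and later.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a compiled regexp if $string looks like a
# stringified qr// object, and undef for anything else.
sub maybe_regexp {
    my ($string) = @_;
    return undef
        unless $string =~ m{\A\(\?\^?[a-z]*(?:-[a-z]*)?:.*\)\z}s;
    # eval guards against strings that look right but do not compile.
    my $re = eval { qr/$string/ };
    return $re;
}

for my $key ("" . qr/^x/i, 'title') {
    my $re = maybe_regexp($key);
    print "$key: ", (defined $re ? 'regexp' : 'plain string'), "\n";
}
```

A plain string that happens to be written like (?i:foo) would be treated as a regexp too; that ambiguity comes with passing regexps as strings.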


Re: Converting a stringified regexp back into a regexp
by Roy Johnson (Monsignor) on Apr 07, 2005 at 15:50 UTC

Caution: Contents may have been coded under pressure.


Re^2: Converting a stringified regexp back into a regexp

by mirod (Canon) on Apr 07, 2005 at 16:06 UTC

Thanks, applied.

I had always dreamed of writing this :--)


Re: Converting a stringified regexp back into a regexp
by ysth (Canon) on Apr 07, 2005 at 16:54 UTC

I think you want m{}s if your regex may have literal newlines. And if all the xism flags are on, there won't be a -.
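Both points can be checked with a small sketch. The two input strings are hand-written in the older stringified style (an assumption, so the result does not depend on the perl running it), and $relaxed is one possible adjusted pattern, not necessarily the one that ends up in the module.

```perl
use strict;
use warnings;

my $strict  = qr{^\(\?([xism]*)-([xism]*):(.*)\)$};        # pattern from the question
my $relaxed = qr{^\(\?([xism]*)(?:-([xism]*))?:(.*)\)$}s;  # optional "-", plus /s

my $with_newline = "(?-xism:a\nb)";   # regexp body containing a literal newline
my $all_flags_on = "(?msix:^x)";      # every flag on, so no "-" section

for my $string ($with_newline, $all_flags_on) {
    printf "strict: %-8s  relaxed: %s\n",
        ($string =~ $strict  ? 'match' : 'no match'),
        ($string =~ $relaxed ? 'match' : 'no match');
}
```

Both strings fail the strict pattern and match the relaxed one: without /s the dot stops at the newline, and without the optional group the missing "-" cannot match.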


Re^2: Converting a stringified regexp back into a regexp

by mirod (Canon) on Apr 07, 2005 at 16:58 UTC

I am matching XML element names, so probably no newlines there. But indeed I have to make the second group optional: m{^\(\?([xism]*)(?:-[xism]*)?:(.*)\)$}.


Re^3: Converting a stringified regexp back into a regexp

by Anonymous Monk on Apr 07, 2005 at 17:28 UTC

How would you deal with 'g' or 'o' modifiers: i.e. m/blah/g


Re^4: Converting a stringified regexp back into a regexp

by mirod (Canon) on Apr 07, 2005 at 17:49 UTC

Re^4: Converting a stringified regexp back into a regexp

by Anonymous Monk on Apr 07, 2005 at 19:04 UTC

Re: Converting a stringified regexp back into a regexp
by tlm (Prior) on Apr 07, 2005 at 14:37 UTC

I don't see anything wrong with your code, but I have never attempted this. I would just use strings like '^x' as hash keys, and then build the regexps as needed from these plain string keys.
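Something along these lines, as a sketch only: the %handler table, its contents, and the regexp_for helper are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical table keyed by plain pattern source rather than qr// objects.
my %handler = (
    '^x' => 'starts with x',
    'y$' => 'ends with y',
);

# Compile each key once, on demand, and remember the result.
my %compiled;
sub regexp_for {
    my ($source) = @_;
    return $compiled{$source} ||= qr/$source/;
}

for my $name (qw(xml proxy)) {
    for my $source (sort keys %handler) {
        print "$name: $handler{$source}\n" if $name =~ regexp_for($source);
    }
}
```

With plain-string keys, any flags have to be written inline in the key, for example '(?i)^x'.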

the lowliest monk
