Jaspersan has asked for the wisdom of the Perl Monks concerning the following question:

Here is what I have:

#!/usr/bin/perl
$regex = "(?{print \"Hey!\";}";
$string = "some word";
$string =~ /some($regex)/;


This doesn't work. Is there a way I can use (?{ _code_ }) at runtime?

^jasper <jasper@wintermarket.org>

Replies are listed 'Best First'.
Re: RegEx Perl Code
by Ionitor (Scribe) on Jul 23, 2002 at 02:21 UTC
    vladb is right on as to the proper way to do this, but I thought I would comment on why it works that way.

    When you ran your code, you probably got this error:

    Eval-group not allowed at runtime, use re 'eval' in regex m/some((?{print "Hey!";})/ at - line 6.
    Basically, Perl really doesn't like running code that was interpolated into a string. Since a scalar used in a regex could very likely be input by a user, Perl avoids doing any sort of double interpolation (first interpolate the scalar into the regex, then run the code). If you do put
    use re 'eval';
    because you know what you're doing, and add the closing paren that you seem to have forgotten at the end of the string, the code does what you expect.
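As a minimal sketch of the behaviour described above (not the original poster's exact script): the counter `$main::hits` and the two `eval` wrappers are illustrative additions, used only so both outcomes can be observed in one run. The pattern string has its parens balanced.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical counter; a package variable so the code block inside
# the string can reach it by its fully qualified name.
$main::hits = 0;

# A plain string holding an embedded code block, parens balanced.
my $regex  = '(?{ $main::hits++ })';
my $string = "some word";

# Without the pragma, compiling the interpolated pattern dies at runtime.
my $ok_without = eval { $string =~ /some($regex)/; 1 };
my $error      = $@;

# With the pragma in (lexical) scope, the same match is allowed.
my $ok_with = eval {
    use re 'eval';
    $string =~ /some($regex)/;
    1;
};

print "without pragma: ", ($ok_without ? "ran\n" : "refused: $error");
print "with pragma: ", ($ok_with ? "ran" : "failed"), ", hits = $main::hits\n";
```

Because `use re 'eval'` is lexically scoped, it can be confined to the one block that interpolates a trusted string, leaving the safety check in force everywhere else.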
Re: RegEx Perl Code
by vladb (Vicar) on Jul 23, 2002 at 02:00 UTC
    This worked for me:
    $regex = qr/(?{print "Hey!";})/x;
    $string = "some word";
    $string =~ /some($regex)/;

    You can read more about the qr// operator in the perlop documentation. Here's an excerpt for quick reference :)
    qr/STRING/imosx

    This operator quotes (and possibly compiles) its STRING as a regular expression. STRING is interpolated the same way as PATTERN in "m/PATTERN/". If "'" is used as the delimiter, no interpolation is done. Returns a Perl value which may be used instead of the corresponding "/STRING/imosx" expression.

    For example,

        $rex = qr/my.STRING/is;
        s/$rex/foo/;

    is equivalent to

        s/my.STRING/foo/is;

    The result may be used as a subpattern in a match:

        $re = qr/$pattern/;
        $string =~ /foo${re}bar/;   # can be interpolated in other patterns
        $string =~ $re;             # or used standalone
        $string =~ /$re/;           # or this way

    Since Perl may compile the pattern at the moment of execution of qr() operator, using qr() may have speed advantages in some situations, notably if the result of qr() is used standalone:

        sub match {
            my $patterns = shift;
            my @compiled = map qr/$_/i, @$patterns;
            grep {
                my $success = 0;
                foreach my $pat (@compiled) {
                    $success = 1, last if /$pat/;
                }
                $success;
            } @_;
        }

    Precompilation of the pattern into an internal representation at the moment of qr() avoids a need to recompile the pattern every time a match "/$pat/" is attempted. (Perl has many other internal optimizations, but none would be triggered in the above example if we did not use qr() operator.)


    _____________________
    # Under Construction
Re: RegEx Perl Code
by Zaxo (Archbishop) on Jul 23, 2002 at 02:15 UTC

    It just needs some repair for syntax problems:

    #!/usr/bin/perl
    $regex = qr/(?{print "Hey!";}).../;
    $string = "some word";
    $string =~ s/some($regex)/$1 x 5/e;
    print $string,$/;
    Your parens in $regex weren't closed. I added qr// because it's a good thing to do, and some dots to match and a substitution just to demonstrate.

    Update: The 'quote regex' operator, qr//, precompiles the regex before storing it. That improves performance in the same way as m//o and s///o, but the result is stored rather than scoped as /o is. Precompilation gains nothing if the regex is only used once, but if the regex is to be stored in a variable anyway, it does no harm that I know of.
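A small sketch of storing a qr// result and reusing it, as the update describes. The word list and variable names are made up for illustration and are not taken from the thread.

```perl
use strict;
use warnings;

# Hypothetical sample data.
my @words = qw(apple banana cherry);

# Compile once and keep the result in a variable.
my $re = qr/an/;

# The stored pattern can be used standalone on the right of =~ ...
my @matched = grep { $_ =~ $re } @words;

# ... or interpolated into a larger pattern.
my $anchored = qr/^b${re}/;
my @b_words  = grep { /$anchored/ } @words;

print "contains 'an': @matched\n";
print "starts with 'b' then 'an': @b_words\n";
```

Whether this is measurably faster than interpolating a plain string depends on how the pattern is used; the sketch only shows the storage and reuse, not a timing claim.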

    After Compline,
    Zaxo

      I added qr// because it's a good thing to do,
      Could you explain why you think it's a good thing to do?

      Abigail

      I'm not convinced that so-called "compiled regular expressions" are significantly faster. Here's a benchmark, matching IP addresses: one using the 'variable' approach, and one using compiled regular expressions:
      #!/usr/bin/perl

      use strict;
      use warnings 'all';
      use Benchmark;
      use vars qw /$ip_v $ip_re @data/;

      my $quad_v  = q  '(?:25[0-5]|2[0-4]\d|1\d\d|\d\d?)';
      my $quad_re = qr '(?:25[0-5]|2[0-4]\d|1\d\d|\d\d?)';
      my $sep_v   = q  '\.';
      my $sep_re  = qr '\.';

      $ip_v  = qq "$quad_v$sep_v$quad_v$sep_v$quad_v$sep_v$quad_v";
      $ip_re = qr "$quad_re$sep_re$quad_re$sep_re$quad_re$sep_re$quad_re";

      @data = map {join "." => map {int rand 1000} 1 .. 4} 1 .. 1_000;

      timethese -5 => {
          var => 'for (@data) {/$ip_v/}',
          re  => 'for (@data) {/$ip_re/}',
      };

      __END__
      Running this results in:
      Benchmark: running re, var for at least 5 CPU seconds...
              re:  5 wallclock secs ( 5.25 usr +  0.00 sys =  5.25 CPU) @ 46.10/s (n=242)
             var:  5 wallclock secs ( 5.16 usr +  0.00 sys =  5.16 CPU) @ 45.93/s (n=237)
      
      Not what I call a significant win for compiled regular expressions. Perhaps you have examples where the gain is large - I've yet to encounter them.

      Abigail

        Perhaps you have examples where the gain is large - I've yet to encounter them.

        jryan's benchmark code mentioned above can be tweaked with qr for a significant speed optimization. Change the line
            my @patterns = ('B.B', 'CB')x10;
        to
            my @patterns = map qr/$_/, ('B.B', 'CB')x10;
        in the &without subroutine.

        Your benchmark doesn't show any support for compiled patterns, and that's due to an optimization I've already described here.
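For readers who want to try this kind of comparison, here is a self-contained sketch. jryan's original benchmark is not reproduced in this thread, so the random test data and the helper `count_matches` are hypothetical stand-ins; only the pattern list and the `map qr/$_/` change come from the post above. Timings will vary with Perl version and machine.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test data: 200 random 20-letter strings over A..D.
my @letters = ('A' .. 'D');
my @data = map { join '', map { $letters[rand @letters] } 1 .. 20 } 1 .. 200;

# The same pattern list, once as plain strings and once precompiled.
my @strings  = ('B.B', 'CB') x 10;
my @compiled = map { qr/$_/ } @strings;

# Hypothetical helper: try every pattern against every line.
sub count_matches {
    my ($patterns) = @_;
    my $n = 0;
    for my $line (@data) {
        for my $pat (@$patterns) {
            $n++ if $line =~ /$pat/;
        }
    }
    return $n;
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    strings  => sub { count_matches(\@strings) },
    compiled => sub { count_matches(\@compiled) },
});
```

The inner loop alternates between different patterns, which is the situation where a plain string in `/$pat/` has to be recompiled whenever its value changes.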

        Cheers,
        -Anomo

        I ran a few benchmarks here on a "contrived" large regex and large dataset; although the speed increase through qr// isn't what you'd call large, it is at least significant. It should be noted, though, that the so-called "evil" /o was the winner in this case.

        At any rate, that's not to say qr// is bad. Personally, I think it's great because it encourages modularity and readability in regular expressions.