Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I guess I don't understand regular expression switches as well as I should.

What I am trying to do is dynamically create the pattern tested against within a regular expression. The code below attempts to show what I'm wanting to do.

#!/usr/bin/perl
my $s = 'Tue Feb 8 11:11:11 2005: blah blah blah';
my $regex = '/^$wday $mon\s+$mday \d{2}:\d{2}:\d{2} $year:';
my $wday = 'Tue';
my $mon = 'Feb';
my $mday = '8';
my $year = '2005';
findString($s);

sub findString($) {
    my $line = shift;
    print "found\n" if $line =~ /$regex/;
}

However, executing this script doesn't output anything, whereas I expected it to print "found". What switch(es) am I missing?

Any insight you care to pass on would be greatly appreciated. Thanks.

Re: string substitution within regular expressions?
by Roy Johnson (Monsignor) on Feb 08, 2005 at 23:26 UTC
    Variables are interpolated when the pattern is examined -- that is, if you used qr/^$wday..../, it would try to substitute at that point. You're using single quotes, instead. So far so good.

    But when you actually use it, you have /$regex/, so $regex gets interpolated into the pattern you said it was. The dollar signs are end-of-string indicators. What you really want is to eval it.

    sub findString($) {
        my $line = shift;
        print "found\n" if $line =~ eval "qr/$regex/";
    }
    The usual caveats about knowing that the contents of a string-eval are safe apply.
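
    A minimal sketch of the point above, assuming only the sample line from the question: interpolation happens one level deep, so variable names stored in a single-quoted string are never expanded when that string is later used as a pattern. The helper name matches() is hypothetical and exists only for this illustration.

```perl
use strict;
use warnings;

my $line = 'Tue Feb 8 11:11:11 2005: blah blah blah';
my ($wday, $mon) = ('Tue', 'Feb');

# Single quotes: the text "$wday" is stored literally, dollar sign and all.
my $literal = '^$wday $mon';

# qr// interpolates right here, while $wday and $mon already hold values.
my $compiled = qr/^$wday $mon/;

# Hypothetical helper: returns 1 if $line matches the given pattern, else 0.
sub matches {
    my ($pattern) = @_;
    return $line =~ /$pattern/ ? 1 : 0;
}

print "literal:  ", matches($literal),  "\n";   # the regex sees "$" as an anchor
print "compiled: ", matches($compiled), "\n";   # the regex sees "^Tue Feb"
```

    The first pattern cannot match because the regex engine reads each $ as an end-of-string anchor followed by ordinary letters; the second matches because the values were substituted when qr// was evaluated.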

    You can also use the experimental delayed-evaluation feature, though it still requires the variables to be in scope:

    my $s = 'Tue Feb 8 11:11:11 2005: blah blah blah';
    my ($wday, $mon, $mday, $year);
    my $regex = qr/^(??{$wday}) (??{$mon})\s+(??{$mday}) \d{2}:\d{2}:\d{2} (??{$year}):/;
    $wday = 'Tue';
    $mon = 'Feb';
    $mday = '8';
    $year = '2005';

    sub findString($);    # forward declaration, prototype matches the definition
    findString($s);

    sub findString($) {
        my $line = shift;
        print "found\n" if $line =~ /$regex/;
    }
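
    To make the delayed evaluation visible, here is a small sketch, assuming a reasonably recent Perl (lexical variables inside regex code blocks were unreliable in old releases). The sub name starts_with_current_wday() is made up for the example. The same compiled pattern gives a different answer once $wday changes.

```perl
use strict;
use warnings;

my $wday;                        # declared before the pattern is compiled
my $re = qr/^(??{ $wday }) /;    # $wday is read each time a match is attempted

# Hypothetical helper: 1 if the line starts with the current $wday, else 0.
sub starts_with_current_wday {
    my ($line) = @_;
    return $line =~ $re ? 1 : 0;
}

my $sample = 'Tue Feb 8 11:11:11 2005: blah blah blah';

$wday = 'Tue';
print starts_with_current_wday($sample), "\n";

$wday = 'Wed';
print starts_with_current_wday($sample), "\n";
```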

    Caution: Contents may have been coded under pressure.
      Okay, utilizing the code suggested above, I get the expected behaviour. If I change the if-statement to the following, I get the errors noted further below.
      #!/usr/bin/perl
      my $s = 'Tue Feb 8 11:11:11 2005: blah blah blah';
      my $regex = '^$wday $mon\s+$mday \d{2}:\d{2}:\d{2} $year:';
      my $wday = 'Tue';
      my $mon = 'Feb';
      my $mday = '8';
      my $year = '2005';
      findString($s);

      sub findString($) {
          my $line = shift;
          # print "found\n" if $line =~ eval "qr/$regex/";
          if ($line =~ eval "qr/$regex/") {
              print "found\n";
          }
      }
      Do the following errors make sense? Yes, I'm using Perl 5.005, but I cannot change this at this point.
      Scalar found where operator expected at (eval 1) line 1, at end of line
              (Missing operator before ?)
      Backslash found where operator expected at (eval 1) line 1, near "$mon\"
              (Missing operator before \?)
      found
      Placing curly braces around the qr() string pacifies the interpreter, but I no longer see the expected answer. Thanks for any insight.
        The errors don't make any sense to me, given that it works in the non-block if and only fails with block-form if. Does this work any better? Does it fail before the if?
        sub findString($) {
            my $line = shift;
            my $compiled_regex = eval "qr/$regex/";
            if ($line =~ /$compiled_regex/) {
                print "found\n";
            }
        }

      Thanks for your suggestions. Regarding your first answer using eval, doesn't the e switch for regular expressions provide the same functionality?
        There is no e switch for regular expressions. The e switch is for substitutions, and applies to the replacement, not to the pattern.
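
        A short illustration of what the e switch does, as a sketch with a made-up sub name (double_numbers): the replacement half of s/// is run as Perl code, while the pattern half is compiled as usual.

```perl
use strict;
use warnings;

# Hypothetical helper: doubles every number found in the text.
sub double_numbers {
    my ($text) = @_;
    # /e evaluates "$1 * 2" as code for each match; /g repeats for all matches.
    $text =~ s/(\d+)/$1 * 2/ge;
    return $text;
}

print double_numbers('8 apples and 11 pears'), "\n";   # 16 apples and 22 pears
```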

Re: string substitution within regular expressions?
by cowboy (Friar) on Feb 08, 2005 at 23:31 UTC
    You could handle this with an eval, although depending on where your input is coming from, that could be dangerous. (note, the / inside your regex will cause problems in your example)
    $regex = eval "qr/$regex/";    # assumes the stray leading / has been removed from $regex
    As a better alternative, you could use something like sprintf.
    my $s = 'Tue Feb 8 11:11:11 2005: blah blah blah';
    # Single quotes keep \s and \d intact; %s marks the slots filled by sprintf.
    my $regex = '^%s %s\s+%s \d{2}:\d{2}:\d{2} %s:';
    my $wday = 'Tue';
    my $mon = 'Feb';
    my $mday = '8';
    my $year = '2005';
    findString($s, $wday, $mon, $mday, $year);

    sub findString {
        my $line = shift;
        my $pattern = sprintf($regex, @_);
        print "found\n" if $line =~ /$pattern/;
    }
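
    If the field values might ever contain regex metacharacters, one possible refinement of the sprintf idea is to escape them with quotemeta before filling the template. This is only a sketch; build_pattern() is a hypothetical helper name, not something from the original post.

```perl
use strict;
use warnings;

# %s slots take the literal fields; single quotes keep \s and \d intact.
my $template = '^%s %s\s+%s \d{2}:\d{2}:\d{2} %s:';

# Hypothetical helper: escapes each field, fills the template, compiles it.
sub build_pattern {
    my @fields  = map { quotemeta($_) } @_;
    my $pattern = sprintf $template, @fields;
    return qr/$pattern/;
}

my $re = build_pattern('Tue', 'Feb', '8', '2005');
print "found\n" if 'Tue Feb 8 11:11:11 2005: blah blah blah' =~ $re;
```

    Compiling once with qr// also means the pattern can be reused for many lines without calling sprintf again.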
Re: string substitution within regular expressions?
by ww (Archbishop) on Feb 09, 2005 at 03:26 UTC
    Much good info from cowboy, Roy J. et al.

    And if the code you posted is merely an attempt to show us concisely your desire, please ignore this... but if it's reflective of your actual plan of attack, the reference to "dynamically" suggests (to me, anyway) that you need to think thru the meaning of "dynamically" or maybe provide a different term which reflects your actual intent.

    The elements of the regex -- $wday, $mon, $mday and $year -- are hard coded here. Only if you actually intend to get their values from STDIN or somesuch would this qualify (in my book, which may not be yours) as "dynamic."

    I'd also suggest perlretut and perlrequick for more on regexen.

Re: string substitution within regular expressions?
by sh1tn (Priest) on Feb 08, 2005 at 23:56 UTC
    Not so good regexs:
    my $s = 'Tue Feb 8 11:11:11 2005: blah blah blah';
    my $regex = {
        'wday' => qr{^\s*(\w+)\s+},
        'mon'  => qr{^\w+\s+(\w+)},
        'mday' => qr{\s+(\d{1,2})\s+},
        'year' => qr{\s+(\d{4})},
        'time' => qr{((?:\d+:){2}\d+)}
    };
    my %dt = findString($s);
    print "$dt{year} $dt{time} $dt{mday}";
    #...

    sub findString($) {
        my $line = shift;
        my %dt;    # date/time
        for (keys %$regex) {
            $line =~ /$regex->{$_}/ and $dt{$_} = $1;
        }
        #print "found year\n" if exists $dt{'year'};
        return %dt;    # hand the captured fields back to the caller
    }
Re: string substitution within regular expressions?
by Anonymous Monk on Feb 10, 2005 at 16:09 UTC
    You do NOT need to use eval if you use qr//. This is important because, as other people are saying, eval poses a security risk if someone were clever enough to close your regular expression and put arbitrary commands after it. Instead, you can do something like this:
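    # The variables must be assigned before qr// is evaluated.
    my ($wday,$mon,$mday,$year)=('Tue','Feb','8','2005');
    my $s='Tue Feb 8 11:11:11 2005: blah blah blah';
    my $regex=qr(^$wday $mon\s+$mday \d{2}:\d{2}:\d{2} $year:);
    findstring($s);

    sub findstring($) {
        my $line=shift;
        print "found\n" if $line=~/$regex/;
    }
    qr// does interpolate variables, just as a match operator does, so the code above works as long as the variables are assigned before the qr// is evaluated. Alternatively, you can build the pattern as a string first and then compile it:
    # interpolate the variables (you have to define your
    # variables to interpolate first so they work).
    # Backslashes are doubled because a double-quoted string
    # would otherwise turn \s and \d into plain s and d.
    my $regex="^$wday $mon\\s+$mday \\d{2}:\\d{2}:\\d{2} $year:";
    $regex=qr($regex); # compile the regular expression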
    If the date you are looking for can change, you could create a simple subroutine that will take the weekday, month, year, and day parameters and return the compiled regular expression. That would be like this:
    my $s='Tue Feb 8 11:11:11 2005: blah blah blah';
    my $regex=createRegEx('Tue','Feb','8','2005');
    findstring($s);

    sub findstring($) {
        my $line=shift;
        print "found\n" if $line=~/$regex/;
    }

    sub createRegEx($$$$) {
        my ($wday,$mon,$mday,$year)=@_;
        # Backslashes are doubled so the double-quoted string keeps \s and \d.
        my $regex="^$wday $mon\\s+$mday \\d{2}:\\d{2}:\\d{2} $year:";
        return(qr($regex));
    }