in reply to Re: Regex Grumblings (Variable Interpolation)

This is precisely why I wrote some test code to explore this. It is quite similar, in fact:
    $foo = '$bar';
    $bar = 'snafu';
    $_ = 'I am certain that the value of $bar is "snafu".';
    print $_,"\n";
    s/$foo/BAR/g;
    print $_,"\n";
However, it doesn't interpolate '$bar' into anything meaningful, and as such, 'snafu' does not get replaced as one might surmise.

Re: Re^2: Regex Grumblings (Variable Interpolation)
by Corion (Patriarch) on May 23, 2001 at 15:35 UTC

    Of course it doesn't. $ within a regex means end-of-line, and trying to match something (nonempty) after the end of the line cannot succeed in a single-line match.

    (But I have to admit, I had to run Perl for this and then stare at the output for some time)

    Update: I can't confirm jeroenes' findings with Perl 5.003 under Solaris. I only get one bar printed and no substitution.

      $ within a regex means end of line, unless it could be interpreted as a variable. It has been this way for a while, as Perl4 wasn't very astute in this regard, and often needed to be coached, such as s/${foo}/XYZ/g.

      In my initial example, s/$bar/XYZ/g operates as expected, replacing instances of '$bar' with 'XYZ'. Further, as larryk pointed out, s/$foo/BAR/g should resolve to s/$bar/BAR/g given that $foo is '$bar', but this interpolated result is treated more literally somehow than if you had just put that very code in there in the first place, or had eval'd it as such.

      I am convinced this is an inconsistency, or perhaps, a peculiar feature of the regular expression compiler. The "intelligence" that Perl demonstrates in the initial compilation does not apply to the post-interpolation compilation phase, to put it more technically.
        It's not peculiar. There are two steps. First, the written RE is transformed into a value of 'type' RE, just as a written 2 is transformed into the number 2. This step includes interpolating the variables, but only once. Step 2 is the match: no further interpolation, all according to the rules. Case in point: /$m/o. Study the o flag!
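
The two steps described above can be checked with a small script. This is only a sketch: the variable names $plain, $quoted and $direct are made up for illustration, and lexical variables are used so that it runs under strict.

```perl
use strict;
use warnings;

my $bar  = 'snafu';
my $foo  = '$bar';    # single quotes: the four literal characters $ b a r
my $text = 'I am certain that the value of $bar is "snafu".';

# Step 1 interpolates $foo exactly once, so the regex source becomes: $bar
# Step 2 compiles that text as written: "$" is the end-of-line anchor,
# followed by the characters "bar", which can never match here.
my $plain = ($text =~ /$foo/) ? 1 : 0;

# \Q...\E escapes the interpolated text, so the pattern is the literal \$bar.
my $quoted = ($text =~ /\Q$foo\E/) ? 1 : 0;

# Writing $bar directly interpolates its value, giving the pattern: snafu
my $direct = ($text =~ /$bar/) ? 1 : 0;

print "plain=$plain quoted=$quoted direct=$direct\n";
# prints: plain=0 quoted=1 direct=1
```

The result of the first step is never fed back through interpolation, which is why $foo holding '$bar' does not end up behaving like $bar.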
Re:{3} Regex Grumblings (Variable Interpolation)
by jeroenes (Priest) on May 23, 2001 at 15:35 UTC
    But even better, then I tried some deliberate typos, to check for funny things. And guess what, I found something funny!
        $_='Some string with $foo and bar';
        $bar='bar';
        $foo='$bar';
        /$bar/ and print "$&\n";
        /$foo/ and print "$&\n";
        /\Q$fooE/ and print "$&\n";
        s/\Q$fooE/XYZ/;
        print "$_\n";
    Prints bar twice, and replaces bar by XYZ! This is definitely very strange... is this a bug or what?

    Jeroen
    "We are not alone"(FZ)
    Update: This was run with perl5.6/linux: "This is perl, v5.6.0 built for i386-linux "
    (2) grinder: I made this typo on purpose.
    (3) Thx grinder, things are as they should be now :-)

      Don't you mean

      /\Q$foo\E/ and print "$&\n"; s/\Q$foo\E/XYZ/;

      Using -w would have helped you figure it out.

      Update: oh right, deliberate tyops. Well, now that I've really thought about it... this is what is going on. The line

      /\Q$fooE/ and print "$&\n";

      given that $fooE is not defined is the same thing as

      /\Q/ and print "$&\n";

      which is the same thing as

      // and print "$&\n";

      Which, because an empty pattern reuses the last successfully matched pattern, is the same thing as matching /$bar/ again. This also applies to the s expression, which is why your bar appears to be magically transmuting itself into XYZ.

      That's not a bug, that's a feature.


      --
      g r i n d e r
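
The empty-pattern behaviour grinder describes can be reproduced without the typo. A minimal sketch, assuming a plain literal // stands in for the pattern that \Q$fooE collapsed to; @seen and $copy are names invented for the example.

```perl
use strict;
use warnings;

my $text = 'Some string with $foo and bar';
my @seen;

# A successful match: "bar" becomes the last successfully matched pattern.
push @seen, $& if $text =~ /bar/;

# A failed match does not replace the remembered pattern.
push @seen, $& if $text =~ /no such text/;

# The empty pattern reuses the last successful pattern, i.e. /bar/.
push @seen, $& if $text =~ //;

# The same holds for s///: this replaces "bar", not an empty string.
(my $copy = $text) =~ s//XYZ/;

print "@seen\n";    # bar bar
print "$copy\n";    # Some string with $foo and XYZ
```

Under strict and -w the original typo would have been reported before it got this far, which is the point of grinder's advice.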
Re: Re^2: Regex Grumblings (Variable Interpolation)
by larryk (Friar) on May 23, 2001 at 16:17 UTC
    You are right, Perl's regexes do undergo variable interpolation but it's only done once, e.g. $bar becomes the value it holds and $foo becomes $bar which doesn't then become $bar's value.

    Neither of your bits of code works (assuming you are using the same $_ value for both) because, once $foo is interpolated to $bar, the regex becomes s/$bar/BAR/g, which is $ (the end of line) followed by the characters 'bar'.

    I expect there is a way to fiddle with the end of line character and then do a s/$foo/BAR/m #treat string as multi-line (or maybe it's s/$foo/BAR/s # treat string as single-line - I can never remember) to get it to match your $_ if you took the $ out (ie. if the \b before 'bar' somehow became an end of line) but I'll have to open that one up as it's over my head. (where's japhy when we need him?)
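
For anyone else who can never remember which flag is which, here is a small sketch of what /m and /s each change. The test string and variable names are invented for the example; the last check also suggests that /m alone would not rescue the interpolated $bar pattern.

```perl
use strict;
use warnings;

my $text = "foo\nbar";

# Default: $ matches only at the end of the string (or before a final newline).
my $default = ($text =~ /foo$/)  ? 1 : 0;
# /m: $ also matches just before any embedded newline.
my $multi   = ($text =~ /foo$/m) ? 1 : 0;
# /s: only changes ".", letting it match a newline; $ is unaffected.
my $single  = ($text =~ /foo$/s) ? 1 : 0;
my $dot     = ($text =~ /foo.bar/)  ? 1 : 0;
my $dot_s   = ($text =~ /foo.bar/s) ? 1 : 0;

# Even under /m the interpolated pattern $bar cannot match: the anchor is
# zero-width, so "bar" would have to start exactly where the newline sits.
my $foo   = '$bar';
my $still = ($text =~ /$foo/m) ? 1 : 0;

print "default=$default multi=$multi single=$single ",
      "dot=$dot dot_s=$dot_s still=$still\n";
# prints: default=0 multi=1 single=0 dot=0 dot_s=1 still=0
```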

    The reason \Q$foo\E works is because after $foo is interpolated to $bar the \Q\E slaps a \ in front of the $ so it is treated literally rather than as the end-of-line marker.
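
The escaping can be made visible with quotemeta, the function form of \Q...\E. A short sketch, with $escaped, $same and $copy as illustrative names only:

```perl
use strict;
use warnings;

my $foo = '$bar';

# quotemeta backslashes every non-word character, so $ gains a backslash.
my $escaped = quotemeta $foo;
my $same    = ($escaped eq '\$bar') ? 1 : 0;

# With the $ escaped, the substitution targets the literal text $bar.
my $text = 'the value of $bar is "snafu"';
(my $copy = $text) =~ s/\Q$foo\E/BAR/g;

print "$escaped\n";    # \$bar
print "$copy\n";       # the value of BAR is "snafu"
```

This gives the same pattern as putting the backslash into $foo by hand.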

    You will find that your second lot of code will work if you do:

        $foo = '\$bar'; # put the backslash in yourself
        $bar = 'snafu';
        $_ = 'I am certain that the value of $bar is "snafu".';
        print $_,"\n";
        s/$foo/BAR/g;
        print $_,"\n";
    Hope this helps, larryk