in reply to Regex Grumblings (Variable Interpolation)

Try the same with $foo="$bar";. Single quotes give you the characters literally ($,b,a,r), whereas double quotes interpolate (w,h,a,t,e,v,e,r,_,b,a,r,_,w,a,s).
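A minimal sketch of that quoting difference, assuming $bar holds the placeholder value 'whatever' (the variable names $single and $double are only for illustration):

```perl
use strict;
use warnings;

my $bar    = 'whatever';
my $single = '$bar';    # single quotes: the four characters $, b, a, r
my $double = "$bar";    # double quotes: the current value of $bar

print "single: $single\n";    # single: $bar
print "double: $double\n";    # double: whatever
```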

In a regex, a '$' is read either as the end of the string or as the start of a variable to interpolate. However, interpolating $foo just gives you the literal characters ($,b,a,r) back, whether you use \Q or not. So:

$_='Some string with $foo and bar';
$bar='bar';
$foo='$bar';
/$bar/ and print "$&\n";
/$foo/ and print "$&\n";
/\Q$foo\E/ and print "$&\n";
Just prints one 'bar'. Take a look at perlre and perlop.

Cheers,

Jeroen
"We are not alone"(FZ)

Re^2: Regex Grumblings (Variable Interpolation)
by tadman (Prior) on May 23, 2001 at 15:26 UTC
    This is precisely why I made some test code to explore this, quite similar, in fact:
    $foo = '$bar';
    $bar = 'snafu';
    $_ = 'I am certain that the value of $bar is "snafu".';
    print $_,"\n";
    s/$foo/BAR/g;
    print $_,"\n";
    However, it doesn't interpolate '$bar' into anything meaningful, and as such, 'snafu' does not get replaced as one might surmise.

      Of course it doesn't. $ within a regex means end-of-line, and trying to match something (nonempty) after the end of the line cannot succeed within a single-line match.

      (But I have to admit, I had to run Perl for this and then stare at the output for some time)

      Update: I can't confirm jeroenes' findings with Perl 5.003 under solaris. I only get one bar printed and no substitution.

        $ within a regex means end of line, unless it could be interpreted as a variable. It has been this way for a while, as Perl4 wasn't very astute in this regard, and often needed to be coached, such as s/${foo}/XYZ/g.

        In my initial example, s/$bar/XYZ/g operates as expected, replacing instances of '$bar' with 'XYZ'. Further, as larryk pointed out, s/$foo/BAR/g should resolve to s/$bar/BAR/g given that $foo is '$bar', but this interpolated result is treated more literally somehow than if you had just put that very code in there in the first place, or had eval'd it as such.

        I am convinced this is an inconsistency, or perhaps, a peculiar feature of the regular expression compiler. The "intelligence" that Perl demonstrates in the initial compilation does not apply to the post-interpolation compilation phase, to put it more technically.
      But even better, then I tried some deliberate typos to check for funny things. And guess what, I found something funny!
      $_='Some string with $foo and bar';
      $bar='bar';
      $foo='$bar';
      /$bar/ and print "$&\n";
      /$foo/ and print "$&\n";
      /\Q$fooE/ and print "$&\n";
      s/\Q$fooE/XYZ/;
      print "$_\n";
      Prints bar twice, and replaces bar by XYZ! This is definitely very strange... is this a bug or what?

      Jeroen
      "We are not alone"(FZ)
      Update: This was run with perl5.6/linux: "This is perl, v5.6.0 built for i386-linux "
      (2) grinder I made this typo on purpose.
      (3) Thx grinder, things are as they should be now :-)

        Don't you mean

        /\Q$foo\E/ and print "$&\n"; s/\Q$foo\E/XYZ/;

        Using -w would have helped you figure it out.

        Update: oh right, deliberate tyops. Well, now that I've really thought about it... this is what is going on. The line

        /\Q$fooE/ and print "$&\n";

        given that $fooE is not defined is the same thing as

        /\Q/ and print "$&\n";

        which is the same thing as

        // and print "$&\n";

        And an empty pattern reuses the pattern of the last successful match, which here is /bar/. This also applies to the s expression, which is why your bar appears to be magically transmuting itself into XYZ.

        That's not a bug, that's a feature.
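A small sketch of the empty-pattern behaviour described above. To keep it runnable under strict, a hypothetical variable $empty (set to the empty string) stands in for the undefined $fooE; $matched and $changed are likewise just illustrative names.

```perl
use strict;
use warnings;

my $text  = 'Some string with $foo and bar';
my $empty = '';    # assumption: stands in for the undefined $fooE

# A successful match; /bar/ becomes the "last successful pattern".
$text =~ /bar/ or die "setup match failed";

# The pattern evaluates to the empty string, so /bar/ is reused.
my $matched = $text =~ /\Q$empty/ ? $& : undef;

# The same reuse happens in a substitution (done on a copy of $text).
(my $changed = $text) =~ s/\Q$empty/XYZ/;

print "matched: $matched\n";    # matched: bar
print "changed: $changed\n";    # changed: Some string with $foo and XYZ
```

Running the original typo'd code with -w would additionally warn about the uninitialized $fooE, which is the hint that the pattern ended up empty.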


        --
        g r i n d e r
      You are right, Perl's regexes do undergo variable interpolation but it's only done once, e.g. $bar becomes the value it holds and $foo becomes $bar which doesn't then become $bar's value.

      The reason neither of your bits of code works (assuming you are using the same $_ value for both) is that once $foo is interpolated to $bar, the regex becomes s/$bar/BAR/g, which is $ (the end of line) followed by the characters 'bar'.
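A rough way to see the single round of interpolation for yourself, reusing the values from the example above (the names $text, $re and $hits are only for illustration):

```perl
use strict;
use warnings;

my $bar  = 'snafu';
my $foo  = '$bar';
my $text = 'I am certain that the value of $bar is "snafu".';

# $foo is interpolated once, so the pattern text is "$bar":
# an end-of-line anchor followed by the letters b, a, r.
my $re = qr/$foo/;
print "pattern: $re\n";    # exact wrapper varies by Perl version, but it contains $bar

# Count the matches; nothing can follow the end of the line, so there are none.
my $hits = () = $text =~ /$re/g;
print "hits: $hits\n";     # hits: 0
```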

      I expect there is a way to fiddle with the end of line character and then do a s/$foo/BAR/m #treat string as multi-line (or maybe it's s/$foo/BAR/s # treat string as single-line - I can never remember) to get it to match your $_ if you took the $ out (ie. if the \b before 'bar' somehow became an end of line) but I'll have to open that one up as it's over my head. (where's japhy when we need him?)

      The reason \Q$foo\E works is because after $foo is interpolated to $bar the \Q\E slaps a \ in front of the $ so it is treated literally rather than as the end-of-line marker.
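For illustration, quotemeta is the function form of \Q...\E, so a short sketch like this shows what ends up in the pattern (the sample text and the names $quoted and $out are made up for the example):

```perl
use strict;
use warnings;

my $foo    = '$bar';
my $quoted = quotemeta $foo;    # the same escaping that \Q$foo\E applies

print "$quoted\n";              # \$bar

my $text = 'the value of $bar is "snafu"';
(my $out = $text) =~ s/\Q$foo\E/BAR/g;    # matches the literal characters $bar
print "$out\n";                 # the value of BAR is "snafu"
```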

      You will find that your second lot of code will work if you do:

      $foo = '\$bar'; # put the backslash in yourself
      $bar = 'snafu';
      $_ = 'I am certain that the value of $bar is "snafu".';
      print $_,"\n";
      s/$foo/BAR/g;
      print $_,"\n";
      Hope this helps, larryk