ckeith100 has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I'm seeing strange behaviour when the pattern for a s/// is an empty string and I was wondering if someone could explain why this behaviour occurs? (man pages are fine too!) Thanks.

If I use an empty pattern for a substitute op on its own, then no values are replaced. For example:

use strict;
use warnings;
my $msg = 'this is a test';
my $pat = '';
my $rep = '';
$msg =~ s/$pat/$rep/;
print $msg, "\n";

Running this gives me what I expect:

this is a test

However, it seems that if I use a substitute prior to this substitute then it uses the pattern of the first operator - even if it acts on a different variable. For example:

use strict;
use warnings;
my $c = 'this is another test';
$c =~ s/^this is/XX/;
my $msg = 'this is a test';
my $pat = '';
my $rep = '';
$msg =~ s/$pat/$rep/;
print $msg, "\n";

Then the output becomes:

a test

As far as I can see, the empty pattern causes the substitution operator to use the previous pattern, almost as though it is caching it. Only the pattern is reused, not the replacement value, and it seems to happen only if the earlier substitution succeeded. A failed match doesn't trigger this.
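The observation above can be reproduced with a minimal sketch like the following. The helper name `demo` and the non-matching pattern `^no such text` are made up for illustration; the strings are the ones from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: run a "priming" substitution on one string, then an
# empty-pattern substitution on a different string, and return the result.
sub demo {
    my ($prime_pat) = @_;

    my $c = 'this is another test';
    $c =~ s/$prime_pat/XX/;       # may or may not succeed

    my $msg = 'this is a test';
    my ($pat, $rep) = ('', '');
    $msg =~ s/$pat/$rep/;         # empty pattern: may reuse the pattern above
    return $msg;
}

# Priming match fails, so the empty pattern behaves as a true empty pattern.
print "after failed match:     [", demo('^no such text'), "]\n";   # [this is a test]

# Priming match succeeds, so its pattern (not its replacement) is reused.
print "after successful match: [", demo('^this is'), "]\n";        # [ a test]
```

The second result has no "XX" in it, which shows that only the pattern is carried over and the replacement still comes from `$rep`.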

This is rather surprising behaviour, and quite alarming too. I discovered this because one of my customers reported a problem with an application we produce. I narrowed it down to this, but I haven't been able to find any description of this behaviour. My assumption would be that it would either do nothing, because it's looking for a 0-width string and fails to find it, or that it applies to every part of the string, because it is a 0-width pattern which matches between every character in the string (sort of like a \b). That it reuses an old pattern, and a pattern applied to a totally separate variable, is very odd behaviour. I'm not sure if this is considered "undefined behaviour", but it makes me concerned for any code that I've written where the substitution pattern comes from a scalar rather than being hard-coded.

Any suggestions on why this is happening (or where it is defined that this happens) are greatly appreciated. I'm using:

perl -v
This is perl, v5.8.8 built for x86_64-linux-thread-multi-ld

On a Fedora Core 6 x86 box with 2x dual-core Opterons, with Perl compiled from source against the standard FC6 RPM libraries.

Thanks, Colin.

Replies are listed 'Best First'.
Re: s/// with an empty pattern uses the previous pattern of a s///
by ikegami (Patriarch) on May 17, 2007 at 15:47 UTC

    Yes, this is documented in perlop:

    If the PATTERN evaluates to the empty string, the last successfully matched regular expression is used instead. In this case, only the g and c flags on the empty pattern is honoured - the other flags are taken from the original pattern. If no match has previously succeeded, this will (silently) act instead as a genuine empty pattern (which will always match).

    If you want $pat = ''; /$pat/ to match only empty strings, add $pat = qr/^\z/ if not length($pat);
    If you want $pat = ''; /$pat/ to match everything, add $pat = qr/.*/ if not length($pat);
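As a rough illustration of the two fallbacks suggested above, here is a minimal sketch. The helper `subst_with_fallback` and the replacement string `-` are hypothetical; only the `qr/^\z/` and `qr/.*/` fallbacks and the `not length($pat)` test come from the reply.

```perl
use strict;
use warnings;

# Hypothetical helper: swap in an explicit fallback regex when the pattern
# string is empty, so the substitution never sees an empty pattern.
sub subst_with_fallback {
    my ($msg, $pat, $rep, $fallback) = @_;
    $pat = $fallback if not length($pat);
    $msg =~ s/$pat/$rep/;
    return $msg;
}

# An earlier successful match that an empty pattern would otherwise reuse.
my $c = 'this is another test';
$c =~ s/^this is/XX/;

# qr/^\z/ matches only an empty string, so a non-empty string is left alone.
print subst_with_fallback('this is a test', '', '-', qr/^\z/), "\n";   # this is a test

# qr/.*/ matches the whole (single-line) string, so it is replaced entirely.
print subst_with_fallback('this is a test', '', '-', qr/.*/), "\n";    # -
```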

      Oh! Another way to fix the problem is to add a no-op to the regexp.

      $\ = "\n";
      $pat = 'b'; $_='abc'; s/$pat/!/g; print;
      $pat = ''; $_='abc'; s/$pat/@/g; print;
      $pat = ''; $_='abc'; s/(?:$pat)/#/g; print;

      outputs

      a!c
      a@c
      #a#b#c#

        Thank you for these tips. I'm not looking to replace the empty scalar; I'm looking to replace whatever is in that scalar with the value of a different scalar. Those values are determined dynamically, so the best answer is to use, as you suggested, something like:

        $pat && ($msg =~ s/$pat/$replace/mg);
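One caveat with a truthiness guard like the one above is that a pattern of '0' is also false in Perl, so it would be skipped as well. A minimal sketch that tests the length instead, assuming a hypothetical helper named `replace_if_pattern` and made-up sample strings:

```perl
use strict;
use warnings;

# Hypothetical helper: apply s///mg only when the pattern string is
# defined and non-empty; otherwise return the message unchanged.
sub replace_if_pattern {
    my ($msg, $pat, $replace) = @_;
    return $msg unless defined($pat) && length($pat);
    $msg =~ s/$pat/$replace/mg;
    return $msg;
}

# An earlier successful match that an empty pattern would otherwise reuse.
my $c = 'this is another test';
$c =~ s/^this is/XX/;

# Empty pattern: the substitution is skipped entirely.
print replace_if_pattern('this is a test', '', 'YY'), "\n";     # this is a test

# The pattern '0' is false as a boolean but has length 1, so it is applied.
print replace_if_pattern('line 0 of 10', '0', 'zero'), "\n";    # line zero of 1zero
```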

        I was doing some more searching and found this exact quote in the perlop man page, as you said in reply #1. However, in my copy of the man page it is listed under the m// operator rather than the substitution operator, so I wasn't aware that it applied there too. It does make sense that it would, but I have to note this to try to scrape together a little dignity after having just posted a question about something in the man pages like a newbie coder :)

        Thanks again for your help! Colin