memnoch has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

I'm wondering if anyone can suggest an elegant solution to my regex substitution question. In the following:

my $greekReplace = "alpha|beta|chi|delta|epsilon|eta|gamma|hbar|kappa|lambda|mu|nu|omega|phi|pi|psi|rho|sigma|tau|theta";
my $string = '3*mu';
$string =~ s/($greekReplace)/\\$1 /isg;

$string would end up as '3*\mu ' (note trailing space). This all works fine.

The issue is that I need to be able to NOT do the 'mu' substitution when 'mu' is prepended with 'a', as in 'amu'.

While I'm sure I can figure out a clunky way of doing the substitution with multiple levels of parentheses and memory variables, I'm wondering if there is a simple solution. I tried using 'a{0}mu' in the above string, but that doesn't work.

Here is some code I have used to test things I've tried:

#!/usr/bin/perl
use strict;
use warnings;

my $greekReplace = "alpha|beta|chi|delta|epsilon|eta|gamma|hbar|kappa|lambda|mu|nu|omega|phi|pi|psi|rho|sigma|tau|theta";

my %strings = (
    mu     => { valid => '\mu ' },
    amu    => { valid => 'amu' },
    bmu    => { valid => 'b\mu ' },
    cmu    => { valid => 'c\mu ' },
    mud    => { valid => '\mu d' },
    '3*mu' => { valid => '3*\mu ' },
);

foreach my $string (sort keys %strings) {
    print "Before: '$string'; ";
    my $string2 = $string;
    $string2 =~ s/($greekReplace)/\\$1 /isg;
    print "After: '$string2' ; Should be: $strings{$string}->{valid}; ";
    print $string2 eq $strings{$string}->{valid} ? "Passes\n" : "FAILS\n";
}
print "\n";

Thanks

Update: Thanks Monks....the negative look-behind does the trick!

Replies are listed 'Best First'.
Re: Regex question
by kennethk (Abbot) on May 18, 2010 at 13:48 UTC
    I note that you want to swap bmu but not amu. Given the conditional nature, you can use negative look-behind assertions to cause the match to fail if the matching string is preceded by an 'a':

    $string2 =~ s/(?<!a)($greekReplace)/\\$1 /isg;

    More info can be found in perlre and perlretut. If you want to encode abeta but not amu, you should swap your 'mu' entry in $greekReplace to (?<!a)mu.
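
A minimal sketch of that last suggestion, assuming the goal really is "escape abeta, leave amu alone". The names $greek_per_entry and escape_greek are made up for illustration and are not from the original post; the look-behind is attached to the 'mu' alternative only.

```perl
use strict;
use warnings;

# Same alternation as in the question, but the negative look-behind is
# attached to the 'mu' entry only (assumption: only 'amu' must be protected).
my $greek_per_entry = "alpha|beta|chi|delta|epsilon|eta|gamma|hbar|kappa|"
                    . "lambda|(?<!a)mu|nu|omega|phi|pi|psi|rho|sigma|tau|theta";

# Hypothetical helper: returns a copy of the input with Greek names escaped.
sub escape_greek {
    my ($text) = @_;
    (my $copy = $text) =~ s/($greek_per_entry)/\\$1 /isg;
    return $copy;
}

# Print each input next to its rewritten form.
for my $input ('amu', 'bmu', 'abeta', '3*mu') {
    printf "%-6s => '%s'\n", $input, escape_greek($input);
}
```

With the look-behind in front of the whole group instead, 'abeta' would be left alone as well, because the 'a' would block every alternative rather than just 'mu'.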

Re: Regex question
by pokki (Monk) on May 18, 2010 at 13:52 UTC

    If you are trying to match on 'mu' not preceded by 'a', you should be able to use a negative look-behind assertion: /(?<!a)mu/

    If you are trying to match on 'mu' on its own, not in the middle of a word, use word boundaries: /\bmu\b/
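
To see how the two patterns differ, here is a small sketch that runs both against the test strings from the question. The helper matching_inputs is a hypothetical name introduced here, not something from the original code.

```perl
use strict;
use warnings;

# The keys of the test table in the question.
my @inputs = ('mu', 'amu', 'bmu', 'cmu', 'mud', '3*mu');

# Hypothetical helper: which of the inputs does a given pattern match?
sub matching_inputs {
    my ($pattern) = @_;
    return grep { /$pattern/ } @inputs;
}

my @lookbehind = matching_inputs(qr/(?<!a)mu/);   # 'mu' not preceded by 'a'
my @boundary   = matching_inputs(qr/\bmu\b/);     # 'mu' as a whole word

print "look-behind: @lookbehind\n";
print "boundaries:  @boundary\n";
```

The look-behind rejects only 'amu', while the boundary version also rejects 'bmu', 'cmu' and 'mud', so which one fits depends on whether those should be rewritten.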

Re: Regex question
by moritz (Cardinal) on May 18, 2010 at 13:46 UTC
    You probably just want a word boundary assertion, \b
    $string =~ s/\b($greekReplace)\b/\\$1 /isg;

    Note that \b doesn't think 3mu has a word boundary between the 3 and mu (because both match \w). If that's a problem, you need to work with the look-around assertions described in perlre.
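
A rough sketch of the 3mu case, under the assumption that only letters should count as "part of a word". The letter-only look-around and both sub names are illustrative choices made here, not something prescribed by perlre.

```perl
use strict;
use warnings;

my $greek = "alpha|beta|chi|delta|epsilon|eta|gamma|hbar|kappa|"
          . "lambda|mu|nu|omega|phi|pi|psi|rho|sigma|tau|theta";

# Variant 1: word boundaries. Digits match \w, so '3mu' has no boundary
# between the '3' and the 'm'.
sub escape_with_boundaries {
    my ($text) = @_;
    (my $copy = $text) =~ s/\b($greek)\b/\\$1 /isg;
    return $copy;
}

# Variant 2 (assumption): a name may not touch a letter on either side,
# but digits and punctuation next to it are fine.
sub escape_with_lookaround {
    my ($text) = @_;
    (my $copy = $text) =~ s/(?<![A-Za-z])($greek)(?![A-Za-z])/\\$1 /isg;
    return $copy;
}

# Show both variants side by side.
for my $input ('3mu', '3*mu', 'amu') {
    printf "%-5s boundaries: '%s'  look-around: '%s'\n",
        $input, escape_with_boundaries($input), escape_with_lookaround($input);
}
```

Note that both variants leave 'bmu' untouched, whereas the test table in the question expects 'b\mu ' there.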

    Update: I should have read the question more carefully; it seems a negative look-behind would be more appropriate:

    s/(?<!a)($greekReplace)/\\$1 /isg;

Re: Regex question
by JavaFan (Canon) on May 18, 2010 at 13:50 UTC
    Surround your pattern with \b anchors.