dragonchild has asked for the wisdom of the Perl Monks concerning the following question:

I've got a situation where I want to remove all X's from a string, except an X that is either the first character or directly after a Y. Y's are guaranteed (through a prior regex) to only be in the first position, should they exist. So, I built the following regex:
$string =~ s/(?<!^Y?)X//g;

And, I get the following error:

Variable length lookbehind not implemented in regex; marked by <-- HERE in m/(?<!^Y?)X <-- HERE / at Some/File.pm line ###

Any suggestions on either how to get a variable-width negative lookbehind or to accomplish what I'm trying to do another way?

------
We are the carpenters and bricklayers of the Information Age.

Then there are Damian modules.... *sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon. - flyingmoose

I shouldn't have to say this, but any code, unless otherwise stated, is untested

Re: Variable-width negative lookbehind
by japhy (Canon) on May 05, 2004 at 15:20 UTC
    Here are a handful of ways:
    1. Break the look-behind into two look-behinds. Make sure it's not preceded by a Y, and then make sure it's not preceded by the beginning of the string:
      $string =~ s/(?<!Y)(?<!^)X//g;
    2. Reverse the string and reverse the sense of the regex. Match an X not followed by an optional Y and then the end of the string:
      my $rstr = reverse $string;
      $rstr =~ s/X(?!Y|\Z)//g;   # note \Z, not $
      $string = reverse $rstr;
    3. Use my Regexp::Keep module (which I hope can be refactored to a standard regex assertion). It provides an "anchor", \K, which saves you from having to replace what you've matched with what you've matched. You'll see the difference here:
      # old:
      # $string =~ s/([^Y])X/$1/g;
      # new:
      $string =~ s/[^Y]\KX//;
    Regexp::Keep's \K anchor basically resets where Perl thinks it has started matching. See its documentation for more explanation.
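The first two options above can be tried side by side. This is only a sketch: the sub names strip_lookbehind and strip_reversed are made up for illustration, and the sample inputs are borrowed from the test cases posted elsewhere in this thread.

```perl
use strict;
use warnings;

# Option 1: two fixed-width negative lookbehinds instead of one
# variable-width lookbehind.
sub strip_lookbehind {
    my ($string) = @_;
    $string =~ s/(?<!Y)(?<!^)X//g;
    return $string;
}

# Option 2: reverse the string, use a negative lookahead, reverse back.
sub strip_reversed {
    my ($string) = @_;
    my $rstr = reverse $string;    # scalar assignment, so characters are reversed
    $rstr =~ s/X(?!Y|\Z)//g;
    return scalar reverse $rstr;
}

for my $input (qw(XAA YXAAXAA VXAAXAA YXX)) {
    printf "%-8s => %-8s %-8s\n",
        $input, strip_lookbehind($input), strip_reversed($input);
}
```

Both subs should agree on these inputs, for example turning YXAAXAA into YXAAAA and leaving XAA alone.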
    _____________________________________________________
    Jeff[japhy]Pinyan: Perl, regex, and perl hacker, who'd like a job (NYC-area)
    s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;
      Regarding #3:
      While the example you give is easily translated into a (non-variable-width) lookbehind ( s/(?<=[^Y])X+//), the more general case of your
      s/$regex1\K$regex2//;
      can be achieved with
      /$regex1/g and s/\G$regex2//;
      (except for side effects of capturing parentheses).

      The PerlMonk tr/// Advocate
        Well, yes, the constant-width part is a big requirement for using a lookbehind, which is why I devised this method. And I'd expect one regex to be faster than two. But you also bring up the capturing parentheses, which are also an advantage of the one-regex method.
Re: Variable-width negative lookbehind
by Zaxo (Archbishop) on May 05, 2004 at 19:52 UTC

    If the first character is 'Y', you want to remove all X's from index 2 on, otherwise remove them from index 1 on. That's a job for tr///d, and substr:

    substr( $string, substr($string,0,1) eq 'Y' ? 2 : 1) =~ tr/X//d;
    TIMTOWTDI.

    Update: Or more succinctly, with $_, substr( $_, /^Y/ ? 2 : 1) =~ tr/X//d;

    After Compline,
    Zaxo
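A sketch of the substr/tr idea wrapped in a sub, for anyone who wants to try it. The sub name strip_after_prefix and the length guard are additions for illustration, not part of the post above. The guard is there because an lvalue substr whose offset lies entirely past the end of the string raises an exception, which could happen for very short inputs such as a lone "Y".

```perl
use strict;
use warnings;

sub strip_after_prefix {
    my ($string) = @_;

    # Protected prefix: two characters if the string starts with Y, else one.
    my $start = substr($string, 0, 1) eq 'Y' ? 2 : 1;

    # Delete every X in the remainder, in place, via lvalue substr.
    # Skip strings too short to have a remainder (assumed guard).
    substr($string, $start) =~ tr/X//d if length($string) > $start;

    return $string;
}

print strip_after_prefix('YXAAXAA'), "\n";   # YXAAAA
print strip_after_prefix('VXAAXAA'), "\n";   # VAAAA
print strip_after_prefix('Y'), "\n";         # Y
```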

      ..or for a quick match, substr, and tr///d:
      $string =~ /^Y?X?/; substr($string, $+[0]) =~ tr/X//d;
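Here is the same idea as a small sub, as a sketch only (the name strip_with_tr is hypothetical). Since /^Y?X?/ always matches, possibly with zero width, $+[0] is always a valid offset no greater than the string's length.

```perl
use strict;
use warnings;

sub strip_with_tr {
    my ($string) = @_;

    # Match the protected prefix: an optional leading Y, then an optional X.
    $string =~ /^Y?X?/;

    # $+[0] is the offset just past that match; delete X's from there on.
    substr($string, $+[0]) =~ tr/X//d;

    return $string;
}

print strip_with_tr('YXAAXAA'), "\n";   # YXAAAA
print strip_with_tr('XAAXAA'), "\n";    # XAAAA
print strip_with_tr('VXAAXAA'), "\n";   # VAAAA
```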

      The PerlMonk tr/// Advocate
      Très élégant.
Re: Variable-width negative lookbehind
by hv (Prior) on May 05, 2004 at 14:12 UTC

    If I correctly understand the problem, you can do this using either:

    $string =~ s/([^Y])X+/$1/g;
    or
    $string =~ s/([^YX])X+/$1/g;
    depending on whether "YXX" should yield "YX" or be left unchanged.

    Hugo
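To see where the two forms differ, a quick sketch (sub names are made up for illustration) that runs both against "YXX" and one other input:

```perl
use strict;
use warnings;

# First form: any non-Y character, including an X, may precede the run of X's.
sub strip_any_non_y {
    my ($string) = @_;
    $string =~ s/([^Y])X+/$1/g;
    return $string;
}

# Second form: the preceding character may be neither Y nor X.
sub strip_non_y_non_x {
    my ($string) = @_;
    $string =~ s/([^YX])X+/$1/g;
    return $string;
}

print strip_any_non_y('YXX'), "\n";       # YX
print strip_non_y_non_x('YXX'), "\n";     # YXX
print strip_any_non_y('VXAAXAA'), "\n";   # VAAAA
```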

      Some test cases:
      • XAA => XAA
      • YXAA => YXAA
      • YXAAXAA => YXAAAA
      • XAAXAA => XAAAA
      • VXAAXAA => VAAAA
      • AA => AA
      • YAA => YAA
      • YXX => YX


        Crap! This almost does it, but I have to run to work. Maybe this will give you some ideas?

        #!/usr/local/bin/perl -wl
        use strict;
        use Test::More qw'no_plan';

        my %strings = (
            XAA     => 'XAA',
            YXAA    => 'YXAA',
            YXAAXAA => 'YXAA',
            XAAXAA  => 'XAAAA',
            VXAAXAA => 'VAAAA',
            AA      => 'AA',
            YAA     => 'YAA',
            YXX     => 'YX',
        );

        while (my ($orig_string, $result) = each %strings) {
            my $string = $orig_string;
            $string = scalar reverse $string;
            # last or directly before a Y
            $string =~ s/X(?!Y|\Z)//g;
            $string = scalar reverse $string;
            is($string, $result, "$orig_string => $result");
        }

        Cheers,
        Ovid

        New address of my CGI Course.

Re: Variable-width negative lookbehind
by matija (Priest) on May 05, 2004 at 14:11 UTC
      Or, translated into lookbehind notation (and moving the + for more efficiency):
      $string =~ s/(?<=[^Y])X+//g;
      As long as there's a non-Y character preceding, remove any string of Xs.

      The PerlMonk tr/// Advocate
Re: Variable-width negative lookbehind
by delirium (Chaplain) on May 05, 2004 at 17:09 UTC
    How about the lovely and talented /e?

    $string =~ s/(.)X+/$1 eq 'Y' ? 'YX' : $1/eg;
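A runnable sketch of the /e approach, with the replacement string quoted so that it also compiles under use strict. The sub name strip_with_eval is hypothetical.

```perl
use strict;
use warnings;

sub strip_with_eval {
    my ($string) = @_;

    # Capture the character before a run of X's. If it is Y, keep one X
    # after it; otherwise drop the whole run and keep only that character.
    $string =~ s/(.)X+/$1 eq 'Y' ? 'YX' : $1/eg;

    return $string;
}

print strip_with_eval('YXX'), "\n";       # YX
print strip_with_eval('YXAAXAA'), "\n";   # YXAAAA
print strip_with_eval('XAAXAA'), "\n";    # XAAAA
```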