Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

Is this a job for a regular expression? I want to match a character followed by anything else except for that first character. I've tried it with backreferences, but that's obviously not possible.

Example (fake code):
    $str = '...+...';
    $str =~ /(.)[^\1]/;
    # And now I had in mind to have $& containing '.+'
/L

Replies are listed 'Best First'.
Re: Anything else is fine!
by MidLifeXis (Monsignor) on Feb 08, 2010 at 17:23 UTC
    while (<>) {
        chomp;
        my @values = (m/
            (.)      # One character
            (?!\1)   # not followed by that character
            (?=.)    # but followed by a character
        /gx);
        print join(",", @values), "\n";
    }

    This uses zero-length lookaheads to test first that the same character does not follow, and then that at least one character follows. Because lookaheads consume nothing, only the first character of each pair is captured.
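
As a quick check of the pattern above, here is a minimal sketch that applies it to the sample string from the question. The helper name chars_before_different is made up for illustration and is not part of the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: return every character that is immediately
# followed by a different character.
sub chars_before_different {
    my ($str) = @_;
    return $str =~ m/
        (.)      # one character
        (?!\1)   # not followed by that same character
        (?=.)    # but followed by some character
    /gx;
}

my @values = chars_before_different('...+...');
print join(",", @values), "\n";   # prints ".,+"
```

The last '.' is not reported because nothing follows it, which is what the (?=.) test is for.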

    Update: Removed needless $_ =~.

    It is said that "only perl can parse Perl." I don't even come close until my 3rd cup of coffee. --MidLifeXis

Re: Anything else is fine!
by kennethk (Abbot) on Feb 08, 2010 at 17:22 UTC
    Assuming I understand your spec, this can be accomplished using a negative lookahead combined with a backreference.

    #!/usr/bin/perl
    use strict;
    use warnings;

    my @bad_strings  = qw(ee);
    my @good_strings = qw(ef);

    foreach (@bad_strings) {
        print "$_ failed\n" if /(.)(?!\1|$)/;
    }
    foreach (@good_strings) {
        print "$_ failed\n" if not /(.)(?!\1|$)/;
    }

    The regular expression works as follows:

    1. Any character is matched and captured into reference 1 ((.))
    2. A negative look ahead ((?!...)) then checks if the next character is either the matched character (\1) or (|) the end of the string ($). If either matches, the expression fails.

    Update: Just noticed the bit about $&. The above code will store '.' in $& for your example. If you want to include both characters, you can append an additional '.' to the end of the regular expression, but note that this consumes two characters and hence removes the opportunity to also match '+.' in your sample. Rather than using $&, you may consider using $-[0] (see @- in perlvar) combined with substr. If you are only interested in the first match, you may want to wrap the entire expression in parentheses as per BrowserUk's suggestion below.
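
The $-[0] plus substr idea might look like the following sketch. It keeps the one-character match so that the scan can still report overlapping pairs, and reads the two-character window out of the string by offset. The sub name pairs_by_offset is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: collect each two-character window whose
# second character differs from its first.
sub pairs_by_offset {
    my ($str) = @_;
    my @pairs;
    while ($str =~ /(.)(?!\1|$)/g) {
        # $-[0] is the offset at which the whole match started
        push @pairs, substr($str, $-[0], 2);
    }
    return @pairs;
}

print join(",", pairs_by_offset('...+...')), "\n";   # prints ".+,+."
```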

Re: Anything else is fine!
by BrowserUk (Patriarch) on Feb 08, 2010 at 17:30 UTC
    $s = 'aaabacadaabbbccccd';
    print $1 while $s =~ m[((.)(?!\2).)]g;;

    ab
    ac
    ad
    ab
    bc
    cd

      You are missing at least ba, ca, da (according to my by-hand count). You need a zero-width lookahead on the following character to keep the regexp engine from gobbling up the second character.

      Update: This could also be a misreading of the OP on my part, or non-overlapping matches may be what the OP intended. Would the Anonymous Monk care to elaborate?

        BrowserUk's regex is easily modified to provide overlapping matches:

        >perl -wMstrict -le
        "my $s = 'aaabacadaabbbccccd';
         print qq{'$1'} while $s =~ m[ (?= ((.) (?! \2) . ) ) ]xmsg;;
        "
        'ab'
        'ba'
        'ac'
        'ca'
        'ad'
        'da'
        'ab'
        'bc'
        'cd'
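
For anyone who prefers a script file over a shell one-liner, the two variants can be put side by side roughly as follows. This is only a sketch; the sub names consuming_pairs and overlapping_pairs are invented here.

```perl
use strict;
use warnings;

# Consuming version: each match eats both characters, so the next
# attempt starts after the pair.
sub consuming_pairs {
    my ($str) = @_;
    my @found;
    push @found, $1 while $str =~ /((.)(?!\2).)/g;
    return @found;
}

# Overlapping version: the pair is captured inside a lookahead, so the
# match itself has zero length and the scan advances one position at a time.
sub overlapping_pairs {
    my ($str) = @_;
    my @found;
    push @found, $1 while $str =~ /(?=((.)(?!\2).))/g;
    return @found;
}

my $s = 'aaabacadaabbbccccd';
print join(" ", consuming_pairs($s)),   "\n";   # ab ac ad ab bc cd
print join(" ", overlapping_pairs($s)), "\n";   # ab ba ac ca ad da ab bc cd
```

Which of the two is wanted depends on whether a character may end one pair and start the next.
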
Re: Anything else is fine!
by Anonymous Monk on Feb 08, 2010 at 17:43 UTC
    Ahh, those were some quick responses!

    Thank you very very much.

    /L
Re: Anything else is fine!
by repellent (Priest) on Feb 10, 2010 at 01:58 UTC
    Non-regex:
    my $str = "xabcabcababca";
    my $is_good = index(substr($str, 1), substr($str, 0, 1)) < 0;
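
A small sketch of how that check behaves, wrapped in a hypothetical helper named first_char_unique. Note that it appears to read the question as "the first character never reappears anywhere later in the string", which differs from the adjacent-pair reading used in the regex replies.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the index/substr test above.
sub first_char_unique {
    my ($str) = @_;
    # index() returns -1 when the first character is not found in the rest
    return index(substr($str, 1), substr($str, 0, 1)) < 0;
}

for my $str ("xabcabcababca", "abca") {
    printf "%s: %s\n", $str, first_char_unique($str) ? "good" : "bad";
}
# prints:
#   xabcabcababca: good
#   abca: bad
```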