in reply to Regex help

You want a capture and a lookahead:

print $2, $/ while /(\b\w+\b)(?=(.*?)\1)/g;

\1 (rather than $1) is the form of backreference needed inside the regex itself.
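For anyone following along, here is a minimal sketch of how the capture and the lookahead cooperate. The sample string is made up for illustration (the original question's input is not quoted in this thread), and @gaps is just a hypothetical name for collecting $2.

```perl
use strict;
use warnings;

# Hypothetical sample input, chosen only for illustration.
my $text = "red green blue red green";

# $1 captures a word. The lookahead then searches for the nearest later
# copy of $1 without consuming anything, and $2 holds the text in between.
my @gaps;
while ($text =~ /(\b\w+\b)(?=(.*?)\1)/g) {
    push @gaps, $2;
}

print "[$_]\n" for @gaps;
# prints:
# [ green blue ]
# [ blue red ]
```

Because the lookahead consumes nothing, the next /g iteration resumes right after the captured word, so overlapping spans are all reported.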

After Compline,
Zaxo

Re: Re: Regex help
by bmann (Priest) on Jan 30, 2004 at 00:01 UTC
    Very cool, but the output of print $2, $/ while /(\b\w+\b)(?=(.*?)\1)/g; is:

    brown fox jumped over the other
    fox jumped over the other quick
    jumped over the other quick brown
    o #????
    To skip that stray "o" (the "the" inside "other" satisfies \1), the backreference in the lookahead needs to be anchored to word boundaries, i.e.:

    print $2, $/ while /(\b\w+\b)(?=(.*?)\b\1\b)/g;
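A quick way to see the difference the \b anchors make, as a sketch. The gaps() helper and the sample string are assumptions made for illustration and do not come from the posts above.

```perl
use strict;
use warnings;

# Hypothetical helper: collect $2 for every match of the given pattern.
sub gaps {
    my ($re, $text) = @_;
    my @found;
    while ($text =~ /$re/g) {
        push @found, $2;
    }
    return @found;
}

my $text = "the other the";   # made-up sample input

# Unanchored backreference: "the" is found inside "other".
my @loose = gaps(qr/(\b\w+\b)(?=(.*?)\1)/, $text);

# Anchored backreference: only a whole-word "the" counts.
my @strict = gaps(qr/(\b\w+\b)(?=(.*?)\b\1\b)/, $text);

print "loose:  [$_]\n" for @loose;    # loose:  [ o]
print "strict: [$_]\n" for @strict;   # strict: [ other ]
```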

    Not to mention, what is the desired output if the string is "the one the two the"?
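One way to explore that question is to run the anchored pattern on the string and see what it does as written. This only shows the current behaviour, not what the desired output ought to be, and @between is a hypothetical name.

```perl
use strict;
use warnings;

my $text = "the one the two the";

# Each "the" is paired with its nearest later whole-word repeat.
my @between;
while ($text =~ /(\b\w+\b)(?=(.*?)\b\1\b)/g) {
    push @between, $2;
}

print "[$_]\n" for @between;
# prints:
# [ one ]
# [ two ]
```

So the first "the" is never paired with the third one. Whether that is wanted depends on the original requirement.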

Re: Re: Regex help
by ysth (Canon) on Jan 30, 2004 at 03:11 UTC
    I can't believe you missed out on this opportunity to wave (--) the magic wand variable ($|) to get only the odd or even elements of a list:
    print grep --$|,($|||=1)&& /\b(\w+)\b(?=(.*?)\b\1\b)/g;
      What the good lord does that code do? Or rather, how does it work? What's the $|||=1 bit, etc.?
        That bit is just charging up the magic wand. It can just as well be a separate statement:
        $| = 1; print grep --$|, /\b(\w+)\b(?=(.*?)\b\1\b)/g;
        To get even elements instead of odd elements, you just start with the magic wand in the other position:
        print grep --$|,($|&&=0)|| /\b(\w+)\b(?=(.*?)\b\1\b)/g;
        or:
        $| = 0; print grep --$|, /\b(\w+)\b(?=(.*?)\b\1\b)/g;
        Why ||= and &&= instead of = is left as an exercise for the reader. Hope that clears things up for you :)
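For readers who have not met the trick: $| can only hold 0 or 1, so pre-decrementing it flips it back and forth, and grep keeps every element for which it lands on 1. Below is a minimal sketch with a plain list standing in for the ($1, $2, $1, $2, ...) list that the match returns in list context. The @pairs data is made up for illustration.

```perl
use strict;
use warnings;

# Stand-in for the alternating ($1, $2, $1, $2, ...) list from the regex.
my @pairs = qw(a 1 b 2 c 3);

# Start at 1: the first --$| gives 0 (drop), the next flips back to 1 (keep).
$| = 1;
my @odd = grep { --$| } @pairs;

# Start at 0: the first --$| flips to 1 (keep), the next gives 0 (drop).
$| = 0;
my @even = grep { --$| } @pairs;

print "odd:  @odd\n";    # odd:  1 2 3
print "even: @even\n";   # even: a b c
```

Keep in mind that $| also controls autoflush on the currently selected output handle, so the toggling has that side effect as well.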