in reply to most efficient regex to delete duplicate words

Here's one way.
    $_ = "alpha beta beta gamma gamma gamma";
    while (s/((\w+)\s\2)/$2/) {};
    print $_;
The second line does something that may not be obvious to everyone, and it seems to duplicate /g's functionality. However, since you've (seemingly) got three gammas in a row, writing
    $_ = "alpha beta beta gamma gamma gamma";
    s/((\w+)\s\2)/$2/g;
    print $_;
will leave you with an extra gamma: /g resumes scanning after the end of the previous match, so the gamma it just kept is never compared with the one that follows. Re-running the substitution in the 'useless' while loop restarts the search each time, so runs of three or more duplicates collapse to one.
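
To see the two behaviours side by side, here is a small sketch. The sub names dedupe_loop and dedupe_global are made up for this illustration, and the \b anchors are my own addition (they are not in the one-liner above), so that a word is not matched against the start of a longer one, as in "the theme".

```perl
use strict;
use warnings;

# Hypothetical helper: re-run the substitution until nothing matches.
sub dedupe_loop {
    my ($text) = @_;
    # \b anchors are an assumption added here; they leave "the theme" alone.
    while ($text =~ s/\b(\w+)\s\1\b/$1/) {}
    return $text;
}

# Hypothetical helper: a single pass with /g.
sub dedupe_global {
    my ($text) = @_;
    $text =~ s/\b(\w+)\s\1\b/$1/g;
    return $text;
}

my $input = "alpha beta beta gamma gamma gamma";
print dedupe_loop($input), "\n";      # alpha beta gamma
print dedupe_global($input), "\n";    # alpha beta gamma gamma
```

The loop costs one extra failed match at the end, which is usually a fair price for catching triples.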

As for the regex you tried, $string =~ s/(\w+)(.*)\b\1/$1 $2/sig;, we have:

Re: Re: most efficient regex to delete duplicate words (boo)
by blakem (Monsignor) on Aug 14, 2001 at 02:37 UTC
    I always prefer:

    1 while (EXPR)

    instead of:

    while (EXPR) {}

    Simply because the '1 while' sticks out at the front, whereas the empty {} tends to get lost. I find that '1 while' immediately flags this Perl idiom and makes it easier for me to pick out.

    -Blake
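
For concreteness, this is roughly what the idiom looks like with the substitution from the parent post filled in for EXPR. The sub name squeeze_repeats is hypothetical, chosen only for this sketch.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the parent post's substitution.
sub squeeze_repeats {
    my ($text) = @_;
    # The leading 1 is a do-nothing statement; all the work happens
    # in the condition, which is re-evaluated until it fails.
    1 while $text =~ s/((\w+)\s\2)/$2/;
    return $text;
}

print squeeze_repeats("alpha beta beta gamma gamma gamma"), "\n";    # alpha beta gamma
```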

      1 while (EXPR); will also avoid the slight overhead of entering and exiting the lexical scope created by the BLOCK in while (EXPR) {}. Perl might be smart enough to optimise out the empty block, though.
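
Whether that overhead is measurable is easy enough to check rather than guess. A rough sketch using the core Benchmark module follows; the sub names with_block and with_modifier and the iteration count are arbitrary choices for illustration, and the numbers will vary with Perl version and machine.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $input = "alpha beta beta gamma gamma gamma";

# Two spellings of the same loop; the names are made up for this sketch.
sub with_block {
    my $s = shift;
    while ($s =~ s/((\w+)\s\2)/$2/) {}
    return $s;
}

sub with_modifier {
    my $s = shift;
    1 while $s =~ s/((\w+)\s\2)/$2/;
    return $s;
}

# Run each form 50_000 times and print a comparison table.
cmpthese(50_000, {
    'while {}' => sub { with_block($input) },
    '1 while'  => sub { with_modifier($input) },
});
```

Most of the time is likely spent in the regex engine, so treat any small difference between the two rows with suspicion.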

      What's wrong with:

      "chill" while (EXPR);

      ???
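
One practical difference, assuming warnings are enabled: perl exempts the constants 0 and 1 from its "Useless use of a constant in void context" warning, so '1 while' stays quiet, while a string such as "chill" triggers it. A small sketch that captures the warnings from each form; warnings_for is a hypothetical helper, and the exact warning text differs between Perl versions.

```perl
use strict;
use warnings;

# Hypothetical helper: compile and run a snippet, returning any warnings raised.
sub warnings_for {
    my ($code) = @_;
    my @seen;
    local $SIG{__WARN__} = sub { push @seen, @_ };
    eval "use warnings; $code; 1" or die $@;
    return @seen;
}

# %s is filled in with the do-nothing statement under test.
my $loop = 'my $s = "a a b"; %s while $s =~ s/((\w+)\s\2)/$2/';

my @quiet = warnings_for(sprintf($loop, '1'));
my @noisy = warnings_for(sprintf($loop, '"chill"'));

print scalar(@quiet), " warning(s) for '1 while'\n";
print "chill: $_" for @noisy;
```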