Here's a non-regexp solution. I'm assuming that the 'separator' between the words is unimportant. This will also remove duplicates that are not 'next to' each other within the string, i.e. the regexps here won't remove the second alpha from "alpha beta alpha", but this will.
my $string = "alpha beta beta gamma gamma gamma foo bar bar baz qux qux qux";

# preserving the order (a repeated word is ranked by its last position)
my $i = 0;
my %h = map { $_ => $i++ } split(/\s+/, $string);
print join(" ", sort { $h{$a} <=> $h{$b} } keys %h);
print "\n";

# destroying the order (reuse %h rather than redeclaring it)
%h = map { $_ => 1 } split(/\s+/, $string);
print join(" ", keys %h);
print "\n";
Yields:
alpha beta gamma foo bar baz qux
foo gamma baz bar beta alpha qux
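One caveat with the order-preserving variant: because map overwrites the hash value on every repeat, a word ends up ranked by its last position, so "alpha beta alpha" would come out as "beta alpha". If first-seen order matters, a %seen hash with grep is one way to get it. This is only a minimal sketch, not part of the code above; the helper name dedupe_words is made up for illustration, and it assumes the words are separated by whitespace.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): keep only the first
# occurrence of each whitespace-separated word, in input order.
sub dedupe_words {
    my ($string) = @_;
    my %seen;
    # $seen{$_}++ is 0 (false) the first time a word appears, so the
    # negation lets it through once and filters every later repeat.
    return join ' ', grep { !$seen{$_}++ } split ' ', $string;
}

print dedupe_words("alpha beta alpha"), "\n";   # prints "alpha beta"
print dedupe_words("alpha beta beta gamma gamma gamma foo bar bar baz qux qux qux"), "\n";
```

No sort is needed here, since grep returns the surviving words in the order it was given them.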

/\/\averick
perl -l -e "eval pack('h*','072796e6470272f2c5f2c5166756279636b672');"


In reply to Re: most efficient regex to delete duplicate words by maverick
in thread most efficient regex to delete duplicate words by Anonymous Monk
