remove duplicates

cruelty has asked for the wisdom of the Perl Monks concerning the following question:

Re: remove duplicates by blakem (Monsignor) on Oct 11, 2002 at 07:56 UTC
How about: `1 while $text =~ s/(\b(\w+)\b.*)\b\2\b\s?/$1/s;` -Blake
Re: Re: remove duplicates by cruelty (Novice) on Oct 11, 2002 at 08:06 UTC
Absolutely brilliant! Thanks.
Re: remove duplicates by joe++ (Friar) on Oct 11, 2002 at 08:05 UTC
Maybe you should be more explicit about your requirements. For instance, do you want to keep the order of the first occurrence of each word? If not, splitting on whitespace, using each list element as a hash key, and flattening the result with keys() will give you the list of unique words.

```perl
my $string = "foo bar ...";
my %L;
# split(' ') skips leading whitespace and treats runs of whitespace as one separator
$L{$_}++ for split(' ', $string);
# bonus: each hash element has the number of occurrences as value.
$string = join(' ', keys(%L));
```

HTH! Note: this is untested...

--
Cheers, Joe
Re2: remove duplicates by blakem (Monsignor) on Oct 11, 2002 at 08:14 UTC
If you're going to split and use a hash to check for uniqueness, it's still possible to retain the original order (first occurrences only, of course): `my %seen; $text = join(' ', grep !$seen{$_}++, split(' ',$text));` Though this removes all the newlines as well... -Blake
Re: remove duplicates by Abigail-II (Bishop) on Oct 11, 2002 at 11:02 UTC
Could you be a bit more specific? From your example, it's clear you want to remove duplicate "word"s, but to me it's not clear which whitespace should be kept. There's a newline between bar and foo, and you want to keep the newline, even while you want the bar and foo removed. (All the code given so far in this thread fails to do so.) What are your requirements for whitespace retention? A single example is just too vague. Abigail
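One possible reading of the whitespace requirement, sketched under the assumption that only line breaks must survive and that runs of spaces within a line may be collapsed to a single space: process the text line by line while sharing one `%seen` hash across all lines. The helper name `dedupe_keep_lines` is hypothetical, and the sample text is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: drop repeated words but keep the line structure.
# Assumption: whitespace inside a line may be normalised to single spaces.
sub dedupe_keep_lines {
    my ($text) = @_;
    my %seen;    # shared across lines, so a word is kept only on first sight

    # The -1 limit keeps trailing empty fields, so trailing newlines survive.
    my @lines = split /\n/, $text, -1;

    for my $line (@lines) {
        # $line aliases the array element, so assigning updates @lines in place.
        $line = join ' ', grep { !$seen{$_}++ } split ' ', $line;
    }
    return join "\n", @lines;
}

my $text = "foo bar\nfoo baz bar\nqux";
print dedupe_keep_lines($text), "\n";
```

With the sample text this prints `foo bar`, `baz`, and `qux` on three separate lines. A line whose words have all been seen before is left as an empty line rather than removed; whether that is wanted depends on the actual requirements.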
Re: remove duplicates by mce (Curate) on Oct 11, 2002 at 12:26 UTC
Hi, This question pops up once in a while. It is mostly referred to as "how to find unique elements in an array", as you can easily split your string on spaces. I was wondering, why isn't there a unique method in Perl? It seems quite useful. Anyway, I usually define a UNIVERSAL method like this:

```perl
sub unique {
    my $self = shift;
    my %tmp;
    $tmp{$_} = 1 for @_;    # record each element as a hash key
    return keys %tmp;       # order of the returned keys is not preserved
}
```

and just call it like: `$o->unique(@array); # with $o my blessed object`

But anyway, since blakem and joe++ gave perfectly good answers, I rest my case :-)

---------------------------
Dr. Mark Ceulemans
Senior Consultant
IT Masters, Belgium
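On the question of a built-in unique function: later releases of the core List::Util module (version 1.45 and up) export `uniq`, which returns the first occurrence of each value in the original order. A minimal sketch, assuming such a version is installed; the sample array is made up for illustration.

```perl
use strict;
use warnings;
use List::Util 1.45 qw(uniq);    # uniq is not available in older versions

my @array  = qw(foo bar foo baz bar);
my @unique = uniq @array;        # keeps the first occurrence of each value, in order

print "@unique\n";
```

This prints `foo bar baz`. Unlike the `keys`-based approach, no blessed object is needed and the original order is kept.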