Re: removal of dupes using a hash

This isn't a model of efficiency (every call copies the list and rescans it, so the work grows quadratically), but you could try...

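@data_out = uniq(@sorted_data);

# Keeps the last element of each run sharing a 7-character prefix.
sub uniq
{
    return unless @_;    # empty list in, empty list out
    my ($x, @xs) = @_;
    return $x unless @xs;
    return uniq(@xs) if grep { substr($x,0,7) eq substr($_,0,7) } @xs;
    return ($x, uniq(@xs));
}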

...or a semi-brute force method like...

foreach (reverse @sorted_data) {
    unshift(@data_out, $_) unless ($seen{substr($_,0,7)}++);
}
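For anyone who wants to try the loop as-is, here is a minimal self-contained sketch. The sample records are made up for illustration (the original data isn't shown), and it assumes the first 7 characters are the key. Because the list is walked backwards, the record kept for each key is the last one in the input.

```perl
use strict;
use warnings;

# Hypothetical sample records, already sorted; the first 7 characters are the key.
my @sorted_data = (
    'AAA0001 first',
    'AAA0001 second',
    'BBB0002 only',
    'CCC0003 old',
    'CCC0003 new',
);

my (%seen, @data_out);
foreach (reverse @sorted_data) {
    # Walking backwards: the first record seen for a key is the last one in the input.
    unshift(@data_out, $_) unless $seen{substr($_, 0, 7)}++;
}

print "$_\n" for @data_out;
# prints:
#   AAA0001 second
#   BBB0002 only
#   CCC0003 new
```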

Update: Fixed a bug in uniq (it added a spurious undef to the end of the array). Here's another subroutine I'm a little more fond of...

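@data_out = nub(@sorted_data);

# Takes the last element, drops earlier ones with the same prefix, recurses on the rest.
sub nub
{
    return unless @_;    # empty list in, empty list out
    my @xs = @_; my $x = pop @xs;
    return $x unless @xs;
    return (nub(grep { substr($x,0,7) ne substr($_,0,7) } @xs), $x);
}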

Re: Re: removal of dupes using a hash by sleepingsquirrel (Chaplain) on May 20, 2004 at 23:15 UTC
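...better yet (note that this empties @sorted_data as it goes, and the defined() check keeps a false value such as "0" from ending the loop early)... `while (defined($_ = pop @sorted_data)) { unshift(@data_out, $_) unless ($seen{substr($_,0,7)}++); }`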
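If you'd rather not consume the input array, the usual grep-with-%seen idiom is another option. This is only a sketch: first_by_prefix is a hypothetical helper name, the sample data is invented, and on its own it keeps the first record per prefix, not the last. Reversing before and after gets back the keep-the-last behaviour of the loops above.

```perl
use strict;
use warnings;

# Hypothetical helper: keep the first element for each distinct $len-character prefix.
sub first_by_prefix {
    my ($len, @items) = @_;
    my %seen;
    return grep { !$seen{ substr($_, 0, $len) }++ } @items;
}

# Invented sample data; the first 7 characters are the key.
my @sorted_data = ('AAA0001 first', 'AAA0001 second', 'BBB0002 only');

# First record per prefix.
my @first = first_by_prefix(7, @sorted_data);

# Last record per prefix: filter the reversed list, then restore the order.
my @last = reverse first_by_prefix(7, reverse @sorted_data);

print "first: $_\n" for @first;
print "last:  $_\n" for @last;
```

Unlike the pop loop, @sorted_data is left untouched, and the hash is private to the sub so repeated calls don't interfere with each other.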