in reply to removal of dupes using a hash

This isn't a model of efficiency (each call copies the list and rescans the rest, so it's quadratic), but you could try...
@data_out = uniq(@sorted_data);
sub uniq {
    my ($x, @xs) = @_;
    return unless @_;        # empty list in, empty list out
    return $x unless @xs;
    # drop $x if a later record has the same 7-character key
    return uniq(@xs) if grep { substr($x,0,7) eq substr($_,0,7) } @xs;
    return ($x, uniq(@xs));
}
...or a semi-brute-force method like...
foreach (reverse @sorted_data) {
    unshift(@data_out, $_) unless $seen{substr($_,0,7)}++;
}
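For anyone who wants to see what that loop actually keeps, here's a minimal runnable sketch. The sample records are made up, and I'm assuming (as in the snippets above) that the first 7 characters are the key. Because the list is walked backwards, the last record for each key is the one that survives, and unshift puts the survivors back in their original order.

```perl
use strict;
use warnings;

# Hypothetical sample records; the first 7 characters act as the key.
my @sorted_data = (
    'AAA0001 first',
    'AAA0001 second',
    'BBB0002 only',
    'CCC0003 first',
    'CCC0003 second',
);

my (%seen, @data_out);
# Walk backwards so the last record for each key is the one encountered first.
foreach my $rec (reverse @sorted_data) {
    unshift @data_out, $rec unless $seen{ substr($rec, 0, 7) }++;
}

print "$_\n" for @data_out;
```

On this data it prints the "second" record for AAA0001 and CCC0003, plus the lone BBB0002 record.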
Update: Fixed a bug in uniq (it added a spurious undef to the end of the array).
Here's another subroutine I'm a little more fond of...
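@data_out = nub(@sorted_data);
sub nub {
    my @xs = @_;
    return unless @xs;       # without this, an empty grep result yields a stray undef
    my $x = pop @xs;
    return $x unless @xs;
    # keep $x, then recurse on the earlier records with a different key
    return (nub(grep { substr($x,0,7) ne substr($_,0,7) } @xs), $x);
}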
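Note that every variant in this post keeps the last record for each key. If you'd rather keep the first one, a forward grep over a hash should do it. This is only a sketch: the names @records, %first_seen and @keep_first and the sample data are mine, not from the original question.

```perl
use strict;
use warnings;

# Hypothetical sample records keyed on their first 7 characters.
my @records = ('AAA0001 first', 'AAA0001 second', 'BBB0002 only');

my %first_seen;
# grep walks forward, so the first record for each key passes the filter.
my @keep_first = grep { !$first_seen{ substr($_, 0, 7) }++ } @records;

print "$_\n" for @keep_first;
```

This makes a single pass and leaves the input array untouched.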

Re: Re: removal of dupes using a hash
by sleepingsquirrel (Chaplain) on May 20, 2004 at 23:15 UTC
    ...better yet...
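    # defined() so a record of "0" or "" doesn't end the loop early; note this empties @sorted_data
    while (defined($_ = pop @sorted_data)) { unshift(@data_out, $_) unless $seen{substr($_,0,7)}++; }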
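    One caveat with pop-driven loops like this: a plain truth test on the popped value ends the loop at the first "0" or empty string, so those records get dropped without any warning. Here's a small sketch of the defined() form, using made-up data and hypothetical names (@queue, %keys_seen, @kept):

```perl
use strict;
use warnings;

# Hypothetical data where one record is the string "0", which Perl treats as false.
my @queue = ('0', 'AAA0001 x', 'AAA0001 y');

my (%keys_seen, @kept);
# defined() keeps the loop going until pop actually runs out of elements.
while (defined(my $item = pop @queue)) {
    unshift @kept, $item unless $keys_seen{ substr($item, 0, 7) }++;
}

print scalar(@kept), " records kept\n";
```

    The "0" record makes it into @kept, and @queue is empty afterwards, so copy the array first if you still need it.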