Re^2: Extracting Unique Characters from a Array

Yeah, let's turn a single O(n) scan with a hash (the right way to do it) into an O(n log n) sort followed by another O(n) scan. Aside from the added inefficiency, you've also lost the original ordering, if that needed to be preserved.
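For reference, here is a minimal sketch of the single-pass hash scan described above. The sample list is made up for illustration and is not from the original thread; the point is only that each element is kept the first time it is seen, so the original ordering survives.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data, not from the original thread.
my @chars = qw( b a n a n a s );

# Single pass: keep an element only the first time it is seen.
# %seen counts occurrences; the post-increment returns 0 (false)
# on the first sighting, so !$seen{$_}++ is true exactly once per value.
my %seen;
my @uniq = grep { !$seen{$_}++ } @chars;

print "@uniq\n";    # prints "b a n s"
```

If sorted output is wanted, sort @uniq afterwards rather than sorting the full list first.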

Update: Ah, a followup did note that the list was to be sorted. Still makes more sense to do the O(n) cull of duplicates first and then the O(n log n) sort of the smaller list.

When in doubt:

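#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw( cmpthese );

use constant SIZE  => 6_000;
use constant COUNT => 500;

# Usage: script [iterations] [list size]
my $count = shift || COUNT;
my $size  = shift || SIZE;

# Random integers in [0, $size), so duplicates are likely.
my @source = map { int( rand($size) ) } 1 .. $size;

cmpthese(
    $count,
    {
        # Sort everything, then drop adjacent duplicates.
        sort_first => sub {
            my @sorted = sort @source;
            my $le;    # last element kept
            my @uniq;
            for (@sorted) {
                # sort compares as strings, so compare with ne, not !=
                # (undef != 0 is false, which silently dropped a leading 0)
                if ( !defined $le || $le ne $_ ) {
                    $le = $_;
                    push @uniq, $_;
                }
            }
        },
        # Drop duplicates with a hash, then sort the smaller list.
        cull_first => sub {
            my %seen;
            my @uniq   = grep { !$seen{$_}++ } @source;
            my @sorted = sort @uniq;
        },
    }
);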

exit 0;

__END__

The cull_first version is 8-20% faster for lists of anywhere from 5 to several thousand items.
