in reply to Optimize - Checking multiple occurences of elements in an array

An interesting note was made in the Chatterbox about how the idiomatic way of deleting duplicates won't work if you have both undef and the empty string ("") as values. This is because hash keys are strings, and undef stringifies to the empty string, so both end up as the same key.

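It also won't work for an array of references, for the same reason: what comes back out of the hash are the stringified keys, not the references themselves. (It won't work for an ordered array for a different reason: hash keys come back in no particular order.)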
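To make those failure modes concrete, here is a small sketch. It assumes the idiom in question is the plain "use the values as hash keys" one; the variable names are only for illustration.

```perl
use strict;
use warnings;

# Hash-only idiom: every element is used as a hash key, i.e. as a string.
my $ref  = [ 1, 2 ];
my @data = ( 'b', undef, '', $ref, 'a', $ref );

my %h;
{
    no warnings 'uninitialized';    # undef as a hash key warns under 'use warnings'
    @h{@data} = ();
}
my @keys = keys %h;                 # order of @keys is not the input order

# undef and '' have collapsed into the single key ''.
my $empties = grep { $_ eq '' } @keys;

# The array reference has been replaced by its string form ("ARRAY(0x...)").
my $refs = grep { ref($_) } @keys;

print "keys: ", scalar(@keys), ", empty keys: $empties, references: $refs\n";
```

Six input values with five distinct ones go in, but only four keys come out, none of them a reference.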

In the spirit of fixing both issues, I offer the following:

sub unique (@) {
    my ($arr) = shift;

    my %x;
    for my $index (0 .. $#$arr) {
        my $val = $arr->[$index];

        !defined($val) && do {
            $x{__NOT_DEFINED__} ||= [ $index, undef, ];
            next;
        };

        $x{$val} ||= [ $index, $val, ];
    }

    map { $_->[1] } sort { $a->[0] <=> $b->[0] } values %x;
}
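A quick usage sketch, with the sub repeated so the snippet runs on its own. The sample data is made up; note that the sub is called with an array reference, since it takes its first argument and dereferences it.

```perl
use strict;
use warnings;

sub unique (@) {
    my ($arr) = shift;

    my %x;
    for my $index (0 .. $#$arr) {
        my $val = $arr->[$index];

        # undef gets its own slot so it does not collide with ''
        !defined($val) && do {
            $x{__NOT_DEFINED__} ||= [ $index, undef, ];
            next;
        };

        # remember the first index and the original (unstringified) value
        $x{$val} ||= [ $index, $val, ];
    }

    # restore input order by sorting on the first-seen index
    map { $_->[1] } sort { $a->[0] <=> $b->[0] } values %x;
}

my $ref = { name => 'x' };
my @in  = ( 'b', undef, '', $ref, 'b', undef, $ref, 'a' );
my @out = unique( \@in );    # pass a reference to the array

print scalar(@out), " unique values\n";
```

Here @out holds 'b', undef, '', the original hash reference, and 'a', in that order.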

------
We are the carpenters and bricklayers of the Information Age.

Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement.

Re: A more complete Unique()
by Aristotle (Chancellor) on Jan 23, 2003 at 22:17 UTC
    Very complicated and breaks if a literal __NOT_DEFINED__ appears in the input data. Contrary to the posts so far, the standard idiom is not just using a hash, but combining it with grep:
    my %seen; my @unique = grep !$seen{$_}++, @array;
    This retains order and does not break references. It does have the problem of empty strings and undefs collapsing into a single entry when they (probably) shouldn't, but that's easily fixed:
    my (%seen, $seen_undef); my @unique = grep defined($_) ? !$seen{$_}++ : !$seen_undef++, @array;
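For anyone who wants to try it, here is a minimal sketch of the same grep wrapped in a sub. The name uniq_in_order and the sample data are made up for illustration, and defined($_) is written out in full to keep the parse unambiguous.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the grep idiom; the name is not from the thread.
sub uniq_in_order {
    my ( %seen, $seen_undef );

    # Defined values are tracked by their string form in %seen,
    # undef is tracked separately so it cannot collide with ''.
    return grep { defined($_) ? !$seen{$_}++ : !$seen_undef++ } @_;
}

my $ref = [ 1, 2 ];
my @in  = ( '__NOT_DEFINED__', undef, '', $ref, undef, '', $ref, '__NOT_DEFINED__' );
my @out = uniq_in_order(@in);

print scalar(@out), " unique values\n";
```

With this input, @out keeps the first occurrence of each value in input order: the literal string '__NOT_DEFINED__', undef, the empty string, and the original array reference. grep returns the elements themselves, so the reference comes back intact even though it was stringified for the lookup.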

    Makeshifts last the longest.