in reply to Optimize code | remove duplicates from AOA.

Step back.

Ask yourself if your code is actually too slow for your purpose. Only if the answer is "yes", continue.

Before you optimize, benchmark your code. After each optimization, benchmark again. Did it get faster? If not, revert.

After each optimization step, ask yourself if the code is now fast enough for your purposes. If yes, stop.

Use a profiler to find the slow parts of your program.


After this section on general optimization technique, I'll now propose a hash-based, one-pass solution. You tell me if it's faster or not. (I can't know, because I don't have real test data; four arrays aren't enough for any serious performance analysis.)

my %uniq;
my @new_AOA;
foreach my $A (@$AOA) {
    my $key = $A->[1] . '-' . $A->[3];
    push @new_AOA, $A unless $uniq{$key}++;
}
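To make the loop's behaviour concrete, here is a minimal, self-contained run. The sample rows are invented for illustration (the original data isn't shown in this thread); the only assumption carried over is that columns 1 and 3 together decide whether two rows are duplicates.

```perl
use strict;
use warnings;

# Hypothetical sample data: each row is an array ref. Columns 1 and 3
# (zero-based) together decide whether two rows count as duplicates.
my $AOA = [
    [ 'r1', 'chr1', 100, 'geneA' ],
    [ 'r2', 'chr1', 250, 'geneA' ],   # same columns 1 and 3 as r1
    [ 'r3', 'chr2', 100, 'geneA' ],
    [ 'r4', 'chr1', 300, 'geneB' ],
];

my %uniq;
my @new_AOA;
foreach my $A (@$AOA) {
    my $key = $A->[1] . '-' . $A->[3];
    # $uniq{$key}++ is false (0) only the first time a key is seen,
    # so only the first row for each key is kept.
    push @new_AOA, $A unless $uniq{$key}++;
}

print join(',', map { $_->[0] } @new_AOA), "\n";   # prints "r1,r3,r4"
```

One caveat: if either field can itself contain '-', distinct pairs such as ('a-b', 'c') and ('a', 'b-c') produce the same key. Picking a separator that cannot occur in the data avoids that.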

Using grep instead of the loop might or might not be faster.

Update: grep version:

my %uniq; my @new_AOA = grep { !( $uniq{$A->[1] . '-' . $A->[3]}++ ) } @AOA;

(Untested, but you get the idea).
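Whether grep beats the loop can be measured rather than guessed. The following is a sketch using the core Benchmark module; the generated rows, the row count and the sub names dedupe_loop and dedupe_grep are all made up for the example, so the numbers say little about real data. Note that inside grep each element is aliased to $_.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test data: 50_000 rows whose columns 1 and 3 take at most
# 100 values each, so most rows are duplicates. Real data may differ.
srand(42);
my @AOA = map { [ $_, int(rand 100), 'x', int(rand 100) ] } 1 .. 50_000;

sub dedupe_loop {
    my %uniq;
    my @new_AOA;
    foreach my $A (@AOA) {
        my $key = $A->[1] . '-' . $A->[3];
        push @new_AOA, $A unless $uniq{$key}++;
    }
    return \@new_AOA;
}

sub dedupe_grep {
    my %uniq;
    # grep aliases each element of @AOA to $_
    my @new_AOA = grep { !$uniq{ $_->[1] . '-' . $_->[3] }++ } @AOA;
    return \@new_AOA;
}

# Both variants must agree before their timings mean anything.
my ($by_loop, $by_grep) = (dedupe_loop(), dedupe_grep());
die "results differ" unless "@$by_loop" eq "@$by_grep";

# Run each variant for at least 1 CPU second and print a comparison table.
cmpthese(-1, { loop => \&dedupe_loop, grep => \&dedupe_grep });
```

Run it against a realistic sample of your own data before drawing conclusions.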

And finally, it's a good idea to name variables by what they contain, not by the structure of what they contain.
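As a small illustration of the naming point: the thread doesn't say what the rows represent, so the names below (@customers, NAME, CITY) are purely hypothetical.

```perl
use strict;
use warnings;

# Hypothetical data and names, chosen only to illustrate the naming advice.
my @AOA       = ( [ 1, 'alice', 30, 'berlin' ], [ 2, 'alice', 31, 'berlin' ] );
my @customers = @AOA;   # same data, but the name says what the rows are

# Named column indices document which fields form the uniqueness key.
use constant { NAME => 1, CITY => 3 };

my %seen;
my @unique_customers =
    grep { !$seen{ $_->[NAME] . '-' . $_->[CITY] }++ } @customers;
```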


Re^2: Optimize code | remove duplicates from AOA.
by jonc (Beadle) on Jul 09, 2011 at 16:37 UTC

    Not entirely sure, but (just for completeness): in your grep version, shouldn't $A be replaced with $_, since grep takes the place of the for loop and there is no $A any more?

    So the final (and possibly quickest) version would be:

    my %uniq; my @new_AOA = grep { !( $uniq{$_->[1] . '-' . $_->[3]}++ ) } @AOA;