
Hi All,

I was looking at some code (to get a unique AOA) which I thought was not very efficient. A simplified version of this code is given below.
(Please note that only the combination of indexes 1 and 3 needs to be unique, and the result should be in the same order as the original.)

use strict;
use warnings;
use Data::Dumper;

# Only the combination of index 1 and 3 needs to be unique.
my $AOA = [
    [
        1,
        219.00,  # Need to check this element for unique
        'ABC',
        'Abcdefghijklmnopqrstuvwxyz',  # Need to check this element for unique
        'def',
        '',
    ],
    [
        2,
        219.00,
        'ABC',
        'Abcdefghijklmnopqrstuvwxyz',
        'un',
        'we',
    ],
    [
        3,
        209.20,
        'ABC',
        'AbcdefghijklmnopqrstuvwxyzAbcdefghijk',
        'udf',
        '',
    ],
    [
        4,
        209.20,
        'ABC',
        'Abcdefghijklmnopqrstuvwxyz',
        'und',
        '',
    ],
];

my $max = $#$AOA;
my @matched_indexes;
my $count = 0;
for (my $j=0;$j <= $max ;$j++) {
    for (my $i=$j+1;$i <= $max;$i++) {
        if(($AOA->[$i][1] == $AOA->[$j][1]) && ($AOA->[$i][3] eq $AOA->[$j][3])) {
            $matched_indexes[$count] = $i;
            $count++;
        }
    }
}

foreach my $index(@matched_indexes) {
    $AOA->[$index][1] = undef;
    $AOA->[$index][3] = undef;
}

my @new_AOA;
#Deleting the array elements if undef
for(my $index=0;$index<=$max;$index++) {
    if ((defined $AOA->[$index][1]) && (defined $AOA->[$index][3])) {
        push(@new_AOA, $AOA->[$index]);
    }
}
# Duplicate records removed
print Dumper \@new_AOA;
# Expected Output
#$VAR1 = [
#          [
#            1,
#            219,
#            'ABC',
#            'Abcdefghijklmnopqrstuvwxyz',
#            'def',
#            ''
#          ],
#          [
#            3,
#            '209.2',
#            'ABC',
#            'AbcdefghijklmnopqrstuvwxyzAbcdefghijk',
#            'udf',
#            ''
#          ],
#          [
#            4,
#            '209.2',
#            'ABC',
#            'Abcdefghijklmnopqrstuvwxyz',
#            'und',
#            ''
#          ]
#        ];

I tried a few solutions to make it more efficient; the best I could think of was:

my %HOA;
my $count = 0;
foreach my $A (@$AOA) {
    my $key = $A->[1] . '-' . $A->[3];
    if (!defined $HOA{$key}) {
        $HOA{$key} = [$A, $count++];
    }
}

# Records need to be in the same sequence
my @new_AOA = sort { $a->[1] <=> $b->[1] }values %HOA;
@new_AOA = map {$_->[0]} @new_AOA;
print Dumper \@new_AOA;

The above solution gets rid of the nested loop, though it adds new loops in the form of sort and map (which I need in order to keep the same sequence as in the original AOA).
The other thing I noticed is the length of the hash keys, since index 3 can hold a very large string. I am not sure whether large keys could be an issue.
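For comparison, the same idea can be sketched as a single pass with grep and a %seen hash. Since grep returns elements in their original order, no sort or map should be needed afterwards. This is only a sketch under some assumptions: unique_rows is a hypothetical helper name, index 1 is always numeric, and the "\0" separator never occurs in the data.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a reference to the rows whose
# (index 1, index 3) pair has not been seen before, in original order.
sub unique_rows {
    my ($rows) = @_;
    my %seen;
    return [
        grep {
            # Adding 0 normalises numeric strings such as '219.00' to 219.
            # Assumption: "\0" never appears in the data, so the joined
            # key cannot be ambiguous.
            my $key = join "\0", $_->[1] + 0, $_->[3];
            !$seen{$key}++;    # true only the first time a key is met
        } @$rows
    ];
}

my $rows = [
    [ 1, 219.00, 'ABC', 'Abcdefghijklmnopqrstuvwxyz',            'def', '' ],
    [ 2, 219.00, 'ABC', 'Abcdefghijklmnopqrstuvwxyz',            'un',  'we' ],
    [ 3, 209.20, 'ABC', 'AbcdefghijklmnopqrstuvwxyzAbcdefghijk', 'udf', '' ],
    [ 4, 209.20, 'ABC', 'Abcdefghijklmnopqrstuvwxyz',            'und', '' ],
];

my $unique = unique_rows($rows);

# Print the first column of each surviving row.
print join(',', map { $_->[0] } @$unique), "\n";    # prints "1,3,4"
```

The hash lookup replaces the inner loop, so the work grows roughly linearly with the number of rows instead of quadratically. The key is still the full string from index 3, so the question about very long keys applies to this version as well.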

Please suggest any improvements to the above code.
Thanks in advance.

Update: Fixed some typos.

Regards,
Ashish

In reply to Optimize code | remove duplicates from AOA. by ashish.kvarma
