Hi All,

I was looking at some code (to get the unique rows of an AOA) which I thought was not very efficient. A simplified version of this code is given below.
(Please note that only the combination of index 1 and index 3 needs to be unique, and the result should keep the original order.)

    use strict;
    use warnings;
    use Data::Dumper;

    # Only the combination of index 1 and 3 needs to be unique.
    my $AOA = [
        [
            1,
            219.00,                          # Need to check this element for unique
            'ABC',
            'Abcdefghijklmnopqrstuvwxyz',    # Need to check this element for unique
            'def',
            '',
        ],
        [
            2,
            219.00,
            'ABC',
            'Abcdefghijklmnopqrstuvwxyz',
            'un',
            'we',
        ],
        [
            3,
            209.20,
            'ABC',
            'AbcdefghijklmnopqrstuvwxyzAbcdefghijk',
            'udf',
            '',
        ],
        [
            4,
            209.20,
            'ABC',
            'Abcdefghijklmnopqrstuvwxyz',
            'und',
            '',
        ],
    ];

    my $max = $#$AOA;
    my @matched_indexes;
    my $count = 0;
    for (my $j = 0; $j <= $max; $j++) {
        for (my $i = $j + 1; $i <= $max; $i++) {
            if (($AOA->[$i][1] == $AOA->[$j][1]) && ($AOA->[$i][3] eq $AOA->[$j][3])) {
                $matched_indexes[$count] = $i;
                $count++;
            }
        }
    }

    foreach my $index (@matched_indexes) {
        $AOA->[$index][1] = undef;
        $AOA->[$index][3] = undef;
    }

    my @new_AOA;
    # Deleting the array elements if undef
    for (my $index = 0; $index <= $max; $index++) {
        if ((defined $AOA->[$index][1]) && (defined $AOA->[$index][3])) {
            push(@new_AOA, $AOA->[$index]);
        }
    }

    # Duplicate records removed
    print Dumper \@new_AOA;

    # Expected Output
    #$VAR1 = [
    #          [
    #            1,
    #            219,
    #            'ABC',
    #            'Abcdefghijklmnopqrstuvwxyz',
    #            'def',
    #            ''
    #          ],
    #          [
    #            3,
    #            '209.2',
    #            'ABC',
    #            'AbcdefghijklmnopqrstuvwxyzAbcdefghijk',
    #            'udf',
    #            ''
    #          ],
    #          [
    #            4,
    #            '209.2',
    #            'ABC',
    #            'Abcdefghijklmnopqrstuvwxyz',
    #            'und',
    #            ''
    #          ]
    #        ];

I tried a few solutions to make it more efficient; the best I could think of was:

    my %HOA;
    my $count = 0;
    foreach my $A (@$AOA) {
        my $key = $A->[1] . '-' . $A->[3];
        if (!defined $HOA{$key}) {
            $HOA{$key} = [$A, $count++];
        }
    }

    # Records need to be in the same sequence
    my @new_AOA = sort { $a->[1] <=> $b->[1] } values %HOA;
    @new_AOA = map { $_->[0] } @new_AOA;

    print Dumper \@new_AOA;

The above solution gets rid of the nested loop, though it adds new loops in the form of sort and map (which I need in order to keep the same sequence as in the original AOA).
The other thing I noticed is the length of the hash keys, since index 3 can hold a very large string. I am not sure whether such large keys could be an issue.
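
One direction that might avoid the extra sort and map is to filter the rows in a single ordered pass with a "seen" hash, so the original sequence is kept without sorting. The snippet below is only a sketch under a few assumptions: the helper name unique_rows is made up for illustration, a nested hash replaces the joined '-' key, and index 1 is normalised with 0 + so that it still compares numerically, as the == in the original loop does. As far as I know Perl does not put a practical limit on hash key length, so long keys should mainly cost memory and hashing time in proportion to their size.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original code): keep the first row
# seen for each distinct (index 1, index 3) pair, in the original order.
sub unique_rows {
    my ($rows) = @_;
    my %seen;
    # grep walks the rows once, in order. The post-increment yields 0 (false)
    # the first time a pair is met, so only the first occurrence passes.
    # "0 + ..." forces index 1 to a number, mirroring the == comparison.
    return [ grep { !$seen{ 0 + $_->[1] }{ $_->[3] }++ } @$rows ];
}

# Same sample data as above, written compactly.
my $AOA = [
    [ 1, 219.00, 'ABC', 'Abcdefghijklmnopqrstuvwxyz',            'def', ''   ],
    [ 2, 219.00, 'ABC', 'Abcdefghijklmnopqrstuvwxyz',            'un',  'we' ],
    [ 3, 209.20, 'ABC', 'AbcdefghijklmnopqrstuvwxyzAbcdefghijk', 'udf', ''   ],
    [ 4, 209.20, 'ABC', 'Abcdefghijklmnopqrstuvwxyz',            'und', ''   ],
];

my $new_AOA = unique_rows($AOA);

# Print the first column of the surviving rows: 1,3,4
print join(',', map { $_->[0] } @$new_AOA), "\n";
```

With the sample data this keeps rows 1, 3 and 4, the same rows as the expected output above. Unlike the first version, it does not modify the original AOA.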

Please suggest any improvements to the above code.
Thanks in advance.

Update: Fixed some typos.

Regards,
Ashish

In reply to Optimize code | remove duplicates from AOA. by ashish.kvarma
