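Using splice here is very inefficient and scales very poorly: each splice has to shift the elements that follow the removed one, and it sits inside two nested loops. With K keys and arrays of A1 and A2 elements, the following is O(K*A1*A2*(A1+A2)):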

    for my $k ( keys %harry ) {
        my $a1 = $harry{$k};
        my $a2 = $harry{$phash{$k}};
        # Walk both arrays from the end so splicing doesn't disturb
        # the indexes still to be visited.
        for (my $i1 = @$a1; $i1-- > 0; ) {
            for (my $i2 = @$a2; $i2-- > 0; ) {
                next if $a1->[$i1] ne $a2->[$i2];
                splice( @$a1, $i1, 1 );
                splice( @$a2, $i2, 1 );
                ...
            }
        }
    }

If you keep track of the indexes to delete and delete them later, you can greatly improve the scalability. The following is O(K*A1*A2).

    for my $k ( keys %harry ) {
        my $a1 = $harry{$k};
        my $a2 = $harry{$phash{$k}};
        my %delete_a1;
        my %delete_a2;
        OUTER:
        for my $i1 ( 0 .. $#$a1 ) {
            #next if $delete_a1{$i1};  # Never true.
            for my $i2 ( 0 .. $#$a2 ) {
                next if $delete_a2{$i2};   # Already paired with an earlier element.
                next if $a1->[$i1] ne $a2->[$i2];
                $delete_a1{$i1} = 1;
                $delete_a2{$i2} = 1;
                next OUTER;
            }
        }

        # Rebuild each array once, keeping only the unmarked indexes.
        @$a1 = map $a1->[$_], grep !$delete_a1{$_}, 0..$#$a1;
        @$a2 = map $a2->[$_], grep !$delete_a2{$_}, 0..$#$a2;
    }

You can do even better, at the cost of readability, by using a hash to find matches instead of an inner loop. The following is O(K*(A1+A2)).

    for my $k ( keys %harry ) {
        my $a1 = $harry{$k};
        my $a2 = $harry{$phash{$k}};

        # Map each value in @$a1 to the list of indexes at which it occurs.
        my %seen;
        for ( 0..$#$a1 ) {
            push @{ $seen{ $a1->[$_] } }, $_;
        }

        my %delete_a1;
        my %delete_a2;
        for ( 0..$#$a2 ) {
            my $seen = $seen{ $a2->[$_] };
            if ( $seen && @$seen ) {
                # Pair this element with one not-yet-used match in @$a1.
                $delete_a1{ shift(@$seen) } = 1;
                $delete_a2{$_} = 1;
            }
        }

        @$a1 = map $a1->[$_], grep !$delete_a1{$_}, 0..$#$a1;
        @$a2 = map $a2->[$_], grep !$delete_a2{$_}, 0..$#$a2;
    }
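To make the pairing behaviour concrete, here is a minimal, self-contained sketch of the hash-based approach run on made-up data. The sub name remove_matched_pairs and the sample arrays are my own illustrative assumptions, not something from the thread. It removes one element from each array per match, so a value that occurs more often in one array than in the other leaves the surplus copies behind.

```perl
use strict;
use warnings;

# Hypothetical helper: removes matching elements pairwise from two arrays,
# in place. Each element of @$a2 cancels at most one equal element of @$a1.
sub remove_matched_pairs {
    my ($a1, $a2) = @_;

    # Map each value in @$a1 to the queue of indexes where it occurs.
    my %seen;
    push @{ $seen{ $a1->[$_] } }, $_ for 0 .. $#$a1;

    my (%delete_a1, %delete_a2);
    for my $i2 ( 0 .. $#$a2 ) {
        my $queue = $seen{ $a2->[$i2] };
        next if !$queue || !@$queue;       # No unused match left in @$a1.
        $delete_a1{ shift @$queue } = 1;   # Use up one matching index.
        $delete_a2{$i2} = 1;
    }

    # Rebuild both arrays from the indexes that were not marked.
    @$a1 = map $a1->[$_], grep !$delete_a1{$_}, 0 .. $#$a1;
    @$a2 = map $a2->[$_], grep !$delete_a2{$_}, 0 .. $#$a2;
    return;
}

# Made-up sample data containing duplicates.
my @x = qw( a b b c );
my @y = qw( b d a b b );
remove_matched_pairs( \@x, \@y );
print "x: @x\n";   # x: c
print "y: @y\n";   # y: d b
```

Here @y has three "b"s against two in @x, so one "b" survives in @y. Each index is pushed once and shifted at most once, which is where the linear bound comes from.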

All of these are fairly complicated because I didn't assume that @$a1 and @$a2 each contain only unique elements. If neither @$a1 nor @$a2 contains duplicates, you could use a simple set difference.

    for my $k ( keys %harry ) {
        my $a1 = $harry{$k};
        my $a2 = $harry{$phash{$k}};
        my %seen;
        ++$seen{$_} for @$a1, @$a2;
        # A count of 1 means the value is in only one of the two arrays, so keep it.
        @$a1 = grep $seen{$_} == 1, @$a1;
        @$a2 = grep $seen{$_} == 1, @$a2;
    }
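The counting idea can be checked on small made-up data. The sketch below assumes the goal is the same as in the earlier versions (drop every element present in both arrays) and uses a hypothetical sub name, remove_common. With no duplicates inside either array, a value counted exactly once across both arrays belongs to only one of them, so those are the values that survive.

```perl
use strict;
use warnings;

# Hypothetical helper: assumes neither array contains duplicates.
# Removes, in place, every value that appears in both arrays.
sub remove_common {
    my ($a1, $a2) = @_;

    # Count occurrences across both arrays; shared values reach 2.
    my %seen;
    ++$seen{$_} for @$a1, @$a2;

    @$a1 = grep $seen{$_} == 1, @$a1;
    @$a2 = grep $seen{$_} == 1, @$a2;
    return;
}

# Made-up, duplicate-free sample data.
my @x = qw( a b c );
my @y = qw( c d a );
remove_common( \@x, \@y );
print "x: @x\n";   # x: b
print "y: @y\n";   # y: d
```

This shortcut is only safe under the no-duplicates assumption: a value occurring twice in one array and never in the other would also reach a count of 2 and be dropped by mistake.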

In reply to Re: Accessing (deleting) array elements in a hash of arrays (scalability) by ikegami
in thread Accessing (deleting) array elements in a hash of arrays by onslaught
