Using splice is very inefficient here: each splice shifts every element after the deletion point, so it scales very poorly. The following is O(K*A1*A2*(A1+A2)), where K is the number of keys and A1 and A2 are the lengths of the paired arrays:
    for my $k ( keys %harry ) {
        my $a1 = $harry{$k};
        my $a2 = $harry{$phash{$k}};
        for ( my $i1 = @$a1; $i1--; ) {
            for ( my $i2 = @$a2; $i2--; ) {
                next if $a1->[$i1] ne $a2->[$i2];
                splice( @$a1, $i1, 1 );
                splice( @$a2, $i2, 1 );
                last;  # the element at $i1 is gone, so stop scanning @$a2
            }
        }
    }
If you keep track of the indexes to delete and delete them later, you can greatly improve the scalability. The following is O(K*A1*A2).
    for my $k ( keys %harry ) {
        my $a1 = $harry{$k};
        my $a2 = $harry{$phash{$k}};
        my %delete_a1;
        my %delete_a2;
        OUTER:
        for my $i1 ( 0 .. $#$a1 ) {
            #next if $delete_a1{$i1};  # Never true.
            for my $i2 ( 0 .. $#$a2 ) {
                next if $delete_a2{$i2};
                next if $a1->[$i1] ne $a2->[$i2];
                $delete_a1{$i1} = 1;
                $delete_a2{$i2} = 1;
                next OUTER;
            }
        }
        @$a1 = map $a1->[$_], grep !$delete_a1{$_}, 0..$#$a1;
        @$a2 = map $a2->[$_], grep !$delete_a2{$_}, 0..$#$a2;
    }
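For concreteness, here is a self-contained sketch of the index-tracking approach on some made-up sample data (the %harry and %phash contents below are invented for illustration). Note that it removes one occurrence per matched pair, so duplicate elements are handled:

```perl
use strict;
use warnings;

# Hypothetical sample data: key 'x' is paired with key 'y' via %phash.
my %harry = (
    x => [ 'a', 'b', 'b', 'c' ],
    y => [ 'b', 'c', 'c' ],
);
my %phash = ( x => 'y' );

# Iterating over %phash processes each pair once.
for my $k ( keys %phash ) {
    my $a1 = $harry{$k};
    my $a2 = $harry{ $phash{$k} };
    my %delete_a1;
    my %delete_a2;
    OUTER:
    for my $i1 ( 0 .. $#$a1 ) {
        for my $i2 ( 0 .. $#$a2 ) {
            next if $delete_a2{$i2};                # already claimed by an earlier match
            next if $a1->[$i1] ne $a2->[$i2];
            $delete_a1{$i1} = 1;
            $delete_a2{$i2} = 1;
            next OUTER;                             # each $i1 matches at most once
        }
    }
    # Rebuild both arrays, skipping the indexes marked for deletion.
    @$a1 = map $a1->[$_], grep !$delete_a1{$_}, 0..$#$a1;
    @$a2 = map $a2->[$_], grep !$delete_a2{$_}, 0..$#$a2;
}

print "@{ $harry{x} }\n";   # a b  (one 'b' paired off and was removed)
print "@{ $harry{y} }\n";   # c    (the 'b' and one 'c' were removed)
```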
You can do even better at the cost of readability. The following is O(K*(A1+A2)):
    for my $k ( keys %harry ) {
        my $a1 = $harry{$k};
        my $a2 = $harry{$phash{$k}};
        my %seen;
        for my $i1 ( 0 .. $#$a1 ) {
            push @{ $seen{ $a1->[$i1] } }, $i1;  # value => list of its indexes in @$a1
        }
        my %delete_a1;
        my %delete_a2;
        for my $i2 ( 0 .. $#$a2 ) {
            my $seen = $seen{ $a2->[$i2] };
            if ( $seen && @$seen ) {
                $delete_a1{ shift(@$seen) } = 1;  # claim one matching index in @$a1
                $delete_a2{$i2} = 1;
            }
        }
        @$a1 = map $a1->[$_], grep !$delete_a1{$_}, 0..$#$a1;
        @$a2 = map $a2->[$_], grep !$delete_a2{$_}, 0..$#$a2;
    }
All of these are fairly complicated because I didn't assume @$a1 and @$a2 each contained only unique elements. If neither @$a1 nor @$a2 contains duplicates, you can use a simple set difference instead.
    for my $k ( keys %harry ) {
        my $a1 = $harry{$k};
        my $a2 = $harry{$phash{$k}};
        my %seen;
        ++$seen{$_} for @$a1, @$a2;
        @$a1 = grep $seen{$_} == 1, @$a1;  # keep elements that appear in only one array
        @$a2 = grep $seen{$_} == 1, @$a2;
    }
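A minimal runnable sketch of this counting approach, again on invented sample data: each element's occurrences across both arrays are counted, and only elements seen exactly once (i.e., present in just one of the two arrays) survive, which matches what the pairwise-deletion versions above compute when there are no duplicates:

```perl
use strict;
use warnings;

# Hypothetical sample data with no duplicates within either array.
my %harry = (
    x => [ 'a', 'b', 'c' ],
    y => [ 'b', 'c', 'd' ],
);
my %phash = ( x => 'y' );   # pair key 'x' with key 'y'

for my $k ( keys %phash ) {
    my $a1 = $harry{$k};
    my $a2 = $harry{ $phash{$k} };
    my %seen;
    ++$seen{$_} for @$a1, @$a2;        # count occurrences across both arrays
    @$a1 = grep $seen{$_} == 1, @$a1;  # 'b' and 'c' have count 2, so they go
    @$a2 = grep $seen{$_} == 1, @$a2;
}

print "@{ $harry{x} }\n";   # a
print "@{ $harry{y} }\n";   # d
```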
In reply to Re: Accessing (deleting) array elements in a hash of arrays (scalability)
by ikegami
in thread Accessing (deleting) array elements in a hash of arrays
by onslaught