Accessing (deleting) array elements in a hash of arrays

onslaught has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Accessing (deleting) array elements in a hash of arrays by jwkrahn (Abbot) on Sep 21, 2008 at 23:41 UTC
Your problem could be that delete does not remove an array element; it just sets its value to undef. To actually remove an array element you need to use splice, but you can't do that from inside a loop that is iterating over that array. Also, `${$harry{$phash{$k}}}[$bcount]` is usually written as `$harry{$phash{$k}}[$bcount]`.
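A minimal sketch of the difference between the two, using a made-up array (`@letters` and its contents are illustrative and not from the original question):

```perl
use strict;
use warnings;

# Hypothetical sample data, not from the original question.
my @letters = qw(a b c d);

# delete on a middle element leaves a hole: the slot reads as undef
# and the array keeps its length.
delete $letters[1];
my $len_after_delete = scalar @letters;
my $hole_is_undef    = defined $letters[1] ? 0 : 1;

# splice removes the slot and shifts the later elements down.
splice @letters, 1, 1;
my $len_after_splice = scalar @letters;

print "length after delete: $len_after_delete\n";
print "length after splice: $len_after_splice\n";
print "remaining: @letters\n";
```

The length only changes after the `splice` call, which is why later code that still indexes into the array after a `delete` can run into undef values.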
Re: Accessing (deleting) array elements in a hash of arrays by ysth (Canon) on Sep 22, 2008 at 00:49 UTC
Some sample data would be helpful in figuring out if you have other problems, but you most certainly should not delete from an array you are looping through with foreach. Try looping like this instead: `for ($acount = $#{$harry{$k}}; $acount >= 0; --$acount) { my $a = $harry{$k}[$acount]; for ($bcount = $#{$harry{$phash{$k}}}; $bcount >= 0; --$bcount) { my $b = $harry{$phash{$k}}[$bcount]; if ($a eq $b) ... splice(@{$harry{$k}}, $acount, 1); splice(@{$harry{$phash{$k}}}, $bcount, 1); ...`
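The backwards loop above can be sketched as a complete script. The contents of `%harry` and `%phash` are invented here, and the assumption that `%phash` maps one key of `%harry` to another is only a guess at the original data layout:

```perl
use strict;
use warnings;

# Hypothetical data; the names follow the thread, the contents are invented.
my %harry = (
    x => [qw(apple pear fig)],
    y => [qw(fig kiwi apple)],
);
my %phash = ( x => 'y' );   # assumption: key x is paired with key y

for my $k (keys %phash) {
    my $a1 = $harry{$k};
    my $a2 = $harry{ $phash{$k} };

    # Walk both arrays from the end so that a splice never shifts
    # an index that has yet to be visited.
    OUTER:
    for (my $i1 = $#$a1; $i1 >= 0; --$i1) {
        for (my $i2 = $#$a2; $i2 >= 0; --$i2) {
            next if $a1->[$i1] ne $a2->[$i2];
            splice @$a1, $i1, 1;
            splice @$a2, $i2, 1;
            next OUTER;   # the element at $i1 is gone; move to the next one
        }
    }
}

print "x: @{ $harry{x} }\n";
print "y: @{ $harry{y} }\n";
```

The `next OUTER` after the two `splice` calls matters: once the element at `$i1` has been removed, continuing the inner loop would compare a different element against the rest of the second array.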
Re: Accessing (deleting) array elements in a hash of arrays by jethro (Monsignor) on Sep 21, 2008 at 23:56 UTC
See manual page perlsyn: If any part of LIST is an array, "foreach" will get very confused if you add or remove elements within the loop body, for example with "splice". So don't do that. UPDATE: jwkrahn is right: delete doesn't shrink the array except when it is the last element, so the above manual page excerpt has nothing to do with the problem.
Re: Accessing (deleting) array elements in a hash of arrays (scalability) by ikegami (Patriarch) on Sep 22, 2008 at 02:58 UTC
Using `splice` is very inefficient. It scales very poorly. The following is O(KA1A2(A1+A2)): `for my $k ( keys %harry ) { my $a1 = $harry{$k}; my $a2 = $harry{$phash{$k}}; for (my $i1 = @$a1; $i1-- > 0; ) { for (my $i2 = @$a2; $i2-- > 0; ) { next if $a1->[$i1] ne $a2->[$i2]; splice( @$a1, $i1, 1 ); splice( @$a2, $i2, 1 ); ... } } }` If you keep track of the indexes to delete and delete them later, you can greatly improve the scalability. The following is O(KA1A2): `for my $k ( keys %harry ) { my $a1 = $harry{$k}; my $a2 = $harry{$phash{$k}}; my %delete_a1; my %delete_a2; OUTER: for my $i1 ( 0 .. $#$a1 ) { #next if $delete_a1{$i1}; # Never true. for my $i2 ( 0 .. $#$a2 ) { next if $delete_a2{$i2}; next if $a1->[$i1] ne $a2->[$i2]; $delete_a1{$i1} = 1; $delete_a2{$i2} = 1; next OUTER; } } @$a1 = map $a1->[$_], grep !$delete_a1{$_}, 0..$#$a1; @$a2 = map $a2->[$_], grep !$delete_a2{$_}, 0..$#$a2; }` You can do even better at the cost of readability. The following is O(K(A1+A2)): `for my $k ( keys %harry ) { my $a1 = $harry{$k}; my $a2 = $harry{$phash{$k}}; my %seen; for ( 0..$#$a1 ) { push @{ $seen{ $a1->[$_] } }, $_; } my %delete_a1; my %delete_a2; for ( 0..$#$a2 ) { my $seen = $seen{ $a2->[$_] }; if ( $seen && @$seen ) { $delete_a1{ shift(@$seen) } = 1; $delete_a2{$_} = 1; } } @$a1 = map $a1->[$_], grep !$delete_a1{$_}, 0..$#$a1; @$a2 = map $a2->[$_], grep !$delete_a2{$_}, 0..$#$a2; }` All of these are fairly complicated because I didn't assume @$a1 and @$a2 each contained only unique elements. If @$a1 contains no duplicates, and if @$a2 contains no duplicates, you could use a simple set difference: `for my $k ( keys %harry ) { my $a1 = $harry{$k}; my $a2 = $harry{$phash{$k}}; my %seen; ++$seen{$_} for @$a1, @$a2; @$a1 = grep $seen{$_} == 1, @$a1; @$a2 = grep $seen{$_} == 1, @$a2; }`
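As a worked illustration of the hash-based idea (index each value of the first array once, then consume one stored index per match from the second array), here is a small self-contained sketch. The helper name `remove_matches` and the sample arrays are hypothetical; duplicates are deliberately included to show the pairwise behaviour:

```perl
use strict;
use warnings;

# Hypothetical helper: removes pairwise matches from two arrays in place.
sub remove_matches {
    my ($a1, $a2) = @_;

    # Map each value in @$a1 to the queue of indexes where it occurs.
    my %seen;
    push @{ $seen{ $a1->[$_] } }, $_ for 0 .. $#$a1;

    my (%delete_a1, %delete_a2);
    for my $i2 (0 .. $#$a2) {
        my $queue = $seen{ $a2->[$i2] };
        next unless $queue && @$queue;
        $delete_a1{ shift @$queue } = 1;   # consume one occurrence from @$a1
        $delete_a2{$i2} = 1;
    }

    # Rebuild both arrays once, skipping the marked indexes.
    @$a1 = map { $a1->[$_] } grep { !$delete_a1{$_} } 0 .. $#$a1;
    @$a2 = map { $a2->[$_] } grep { !$delete_a2{$_} } 0 .. $#$a2;
    return;
}

my @first  = qw(a b a c);
my @second = qw(a d a a);
remove_matches(\@first, \@second);

print "first:  @first\n";
print "second: @second\n";
```

With this input the two `a` entries in `@first` cancel two of the three `a` entries in `@second`, so one `a` survives in `@second`. Each array is scanned a constant number of times, which is where the linear behaviour comes from.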
Re: Accessing (deleting) array elements in a hash of arrays by sflitman (Hermit) on Sep 21, 2008 at 23:59 UTC
I think the problem is where you use `${...}[$index]`. Try accessing array elements in a hash of arrays as: `$harry{$phash{$k}}->[$bcount]` The other problem is that I don't think you can call delete on array elements like that, at least not in Perl 5.8.x. The docs say all you get is undef at that position and the later elements don't shift down; you need splice for that. I think I'd probably approach this general problem differently, using grep. `#!/usr/bin/perl use strict; use Data::Dumper; my %hashOfArrays=( a=>[1,2,3], b=>[4,5,6], c=>[7,8,9] ); my %toDelete=( a=>2, b=>6, c=>8 ); for my $array (keys %hashOfArrays) { if ($toDelete{$array}) { $hashOfArrays{$array}=[ grep { $_ != $toDelete{$array} } @{$hashOfArrays{$array}} ]; } } print Dumper(\%hashOfArrays); # prints $VAR1 = { # 'c' => [ # 7, # 9 # ], # 'a' => [ # 1, # 3 # ], # 'b' => [ # 4, # 5 # ] # }; exit;` I think this is what you're describing. A hash of arrays contains data which you may wish to selectively delete by specifying which array in the hash, and which element to delete. This code only allows one element at a time to be specified per array, but it wouldn't be hard to support multiple elements to be deleted if needed. Rather than use delete, I create a new anonymous array reference with `[ EXPR ]`, and the array itself is made by `grep EXPR ARRAY`, which filters the original array down to the elements that do not equal the element to delete. Hope that helps. SSF
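One possible way to extend the grep approach to several values per array is sketched below. It assumes, purely for illustration, that `%toDelete` maps each key to an array reference of values to remove; that layout is not from the original post:

```perl
use strict;
use warnings;

my %hashOfArrays = ( a => [1, 2, 3], b => [4, 5, 6], c => [7, 8, 9] );

# Assumption: each key maps to an array ref of values to remove.
my %toDelete = ( a => [1, 2], c => [8] );

for my $key (keys %toDelete) {
    next unless exists $hashOfArrays{$key};

    # Build a lookup set so each element is tested in constant time.
    my %drop = map { $_ => 1 } @{ $toDelete{$key} };
    $hashOfArrays{$key} = [ grep { !$drop{$_} } @{ $hashOfArrays{$key} } ];
}

print "$_: @{ $hashOfArrays{$_} }\n" for sort keys %hashOfArrays;
```

Using a lookup hash also sidesteps the choice between `!=` and `ne`, since hash keys are compared as strings.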
Re: Accessing (deleting) array elements in a hash of arrays by toolic (Bishop) on Sep 22, 2008 at 02:03 UTC
In addition to the helpful advice already given, a handy debugging tool is Data::Dumper. This can help to get an idea of your array/hash contents at arbitrary points in your code: `use Data::Dumper; print Dumper(\%harry);` It is also customary to post (small) segments of your data structures here to help us help you.
Re: Accessing (deleting) array elements in a hash of arrays (looping over indexes) by ikegami (Patriarch) on Sep 22, 2008 at 02:33 UTC
`my $i = 0; for my $ele (@array) { ... $i++; }` is better written as `for my $i (0..$#array) { my $ele = $array[$i]; ... }` It's much more readable, and it's not broken by the use of `next`. (And you don't have to worry about putting the `$i=0` in the wrong place like you did.) The downside is that `$ele` is no longer an alias. Just use `$array[$i]` directly if you need to modify the original array. Or just use `$array[$i]` directly, period.
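To make the point about `next` concrete, here is a small sketch with made-up data showing how a hand-maintained counter drifts while a loop over indexes does not:

```perl
use strict;
use warnings;

my @array = qw(keep skip keep);   # hypothetical sample data

# Manual counter: `next` jumps past the $i++ at the bottom of the body,
# so $i falls out of step with the element being visited.
my @manual;
my $i = 0;
for my $ele (@array) {
    next if $ele eq 'skip';
    push @manual, "$i:$ele";
    $i++;
}

# Index loop: the index comes from the loop itself, so `next` is harmless.
my @indexed;
for my $j (0 .. $#array) {
    my $ele = $array[$j];
    next if $ele eq 'skip';
    push @indexed, "$j:$ele";
}

print "manual:  @manual\n";
print "indexed: @indexed\n";
```

The manual version labels the last element with index 1 even though it lives at index 2; the index loop reports 0 and 2.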
Re: Accessing (deleting) array elements in a hash of arrays by gone2015 (Deacon) on Sep 22, 2008 at 10:07 UTC
So, you are stepping through two arrays using `foreach $a` and `foreach $b`, and in the inner loop you need the indexes for `$a` (`$acount`) and `$b` (`$bcount`). You are incrementing `$acount` and `$bcount` in the right place -- but you need to set `$bcount` to zero just before the `foreach $b`. I suggest that this accounts for your immediate problem. Others have pointed out that `delete` on arrays doesn't actually remove an array element, and if it did, who knows whether `foreach` would cope. From a style perspective I suggest that calling these indexes `$acount` and `$bcount` obscures their purpose -- which is probably why the problem was hard to see. I would have called them `$ia` and `$ib`. Probably best to stick the initialisation of the index hard up against the related `foreach`. Also, from a style perspective I wonder whether: `my $ia = 0 ; foreach my $a (@afoo) { ... acres of stuff ... $ia++ ; } ;` is as clear as: `for my $ia (0..$#afoo) { my $a = $afoo[$ia] ; } ;` because in the second case the loop control is all at the top of the loop. Mind you, the two are not equivalent, because of the quantum entanglement (aliasing) between `$a` and the array element in the first! Perhaps better is: `my $ia = -1 ; foreach my $a (@afoo) { $ia++ ; ...... } ;` although I cannot help feeling that setting `$ia` to `-1` is ugly :-(
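The original code is not shown in this thread, so the following is only a guess at the shape of the bug: an inner index that is initialised once, outside both loops, compared with one that is reset right before the inner `foreach`. The arrays `@afoo` and `@bfoo` hold invented data:

```perl
use strict;
use warnings;

my @afoo = qw(p q);      # hypothetical sample data
my @bfoo = qw(x y z);

# Inner index initialised once, outside both loops: it keeps growing
# and soon points past the end of @bfoo.
my @stale;
my $ib = 0;
for my $a_ele (@afoo) {
    for my $b_ele (@bfoo) {
        push @stale, $ib;
        $ib++;
    }
}

# Inner index reset hard up against the inner loop.
my @fresh;
for my $a_ele (@afoo) {
    my $jb = 0;
    for my $b_ele (@bfoo) {
        push @fresh, $jb;
        $jb++;
    }
}

print "stale: @stale\n";
print "fresh: @fresh\n";
```

On the second pass of the outer loop the stale index runs from 3 to 5, none of which is a valid position in the three-element `@bfoo`.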