Re: comparing arrays
by ikegami (Patriarch) on Dec 16, 2004 at 01:31 UTC
my %lookup;
my @to_keep;
foreach (0..$#array1) {
my $a1 = $array1[$_];
my $a2 = $array2[$_];
next if $lookup{$a1};
next if $lookup{$a2};
$lookup{$a1} =
$lookup{$a2} = 1;
push(@to_keep, $_);
}
@array1 = @array1[@to_keep];
@array2 = @array2[@to_keep];
The above will yield "interesting" results for
- @array1 = (1, 1, 4); @array2 = (4, 5, 6); --> @array1 = (1); @array2 = (4);
- @array1 = (1, 1, 5); @array2 = (4, 5, 6); --> @array1 = (1, 5); @array2 = (4, 6);
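To check the two cases listed above, the loop can be wrapped in a small helper and run against both inputs. This is only a sketch: `dedupe_pairs` is a hypothetical name that does not appear in the original post, and it returns new arrays rather than modifying the originals.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the loop above: index $i is kept only if
# neither value at that index was recorded at an earlier kept index.
sub dedupe_pairs {
    my ($array1, $array2) = @_;
    my (%lookup, @to_keep);
    for my $i (0 .. $#$array1) {
        my ($a1, $a2) = ($array1->[$i], $array2->[$i]);
        next if $lookup{$a1} || $lookup{$a2};
        $lookup{$a1} = $lookup{$a2} = 1;
        push @to_keep, $i;
    }
    return ([ @{$array1}[@to_keep] ], [ @{$array2}[@to_keep] ]);
}

my ($r1, $r2) = dedupe_pairs([1, 1, 4], [4, 5, 6]);
print "(@$r1) (@$r2)\n";    # (1) (4)

my ($s1, $s2) = dedupe_pairs([1, 1, 5], [4, 5, 6]);
print "(@$s1) (@$s2)\n";    # (1 5) (4 6)
```

In the second case the pair at index 1 is skipped before its values are recorded, which is why 5 is still free to be kept at index 2.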
Re: comparing arrays
by sauoq (Abbot) on Dec 16, 2004 at 01:09 UTC
When this happens, I simply want to remove one copy of the pair and keep the other (remove one element from each array)
You don't explain which array the element should be removed from. In your example, you show one of a set of duplicates being removed from the first array and one from the other set of duplicates being removed from the other array. Could they always be removed from the same array? Do you wish to switch off and remove from first one, then the other, then the first, etc.?
Once you figure that out, it should be pretty easy to do. Hint: use a hash (or two if necessary.) The keys of a hash are unique.
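As a rough illustration of the hint, a hash can count how often each value occurs, since every distinct value maps to exactly one key. The data below is made up for the example and is not from the original question.

```perl
use strict;
use warnings;

# Made-up sample data, for illustration only.
my @array1 = (1, 2, 3);
my @array2 = (3, 4, 1);

# Each distinct value becomes one hash key; the stored number is
# how many times that value was seen across both arrays.
my %count;
$count{$_}++ for @array1, @array2;

# Values seen more than once overall.
my @shared = sort { $a <=> $b } grep { $count{$_} > 1 } keys %count;
print "@shared\n";    # 1 3
```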
-sauoq
"My two cents aren't worth a dime.";
Hi,
Sorry, I thought I had explained it. I want to remove just one copy of the duplicate pair, i.e. one value from each array; it doesn't matter which array the values are removed from. I don't see how a hash would work: it would help extract the unique values in each array, but how could I use it to keep one copy of the duplicate values?
Thanks!
it doesn't matter which array the values are removed from.
In that case, it is very simple. You iterate over one array and rebuild it. If a value shows up in the second array, you just ignore it as you are rebuilding. Use a hash to store the values so that lookup is fast...
my @array1 = (1, 2, 3, 4, 5);
my @array2 = (2, 4, 6, 8, 10);
my %hash = map {$_=>1} @array2;
@array1 = grep { not exists $hash{$_} } @array1;
print "@array1\n";
-sauoq
"My two cents aren't worth a dime.";
Re: comparing arrays
by prasadbabu (Prior) on Dec 16, 2004 at 01:30 UTC
Re: comparing arrays
by ikegami (Patriarch) on Dec 16, 2004 at 01:46 UTC
Or maybe you want to just skip single elements, without caring if you end up with pairs or not.
@array1 = qw(1 1 5 3);
@array2 = qw(4 5 6 1);
my %lookup;
sub filter {
return 0 if $lookup{$_};
$lookup{$_} = 1;
return 1;
}
@array1 = grep filter, @array1;
@array2 = grep filter, @array2;
print('@array1 = (', join(', ', @array1), ")\n"); # 1, 5, 3
print('@array2 = (', join(', ', @array2), ")\n"); # 4, 6
Re: comparing arrays
by nedals (Deacon) on Dec 16, 2004 at 07:32 UTC
I'm reading this differently...
I simply want to remove one copy of the pair
This would indicate that pairs somehow got reversed and duplicated.
So duplicate pairs need to be removed, resulting in two equal-length arrays.
use strict;
my @dataA = qw(1 9 3 5 4 2);
my @dataB = qw(3 2 1 6 7 9);
my $i = 0;
foreach my $num (@dataA) {
foreach (@dataB) {
if ($num == $_) {
splice(@dataA,$i,1);
splice(@dataB,$i,1);
}
}
$i++;
}
print "@dataA\n@dataB\n";
Re: comparing arrays
by Anonymous Monk on Dec 17, 2004 at 00:26 UTC
I got the same set of pairs left in the arrays as TedPride, but I found a problem with his code: he splices while walking the array from the front, which I believe makes the indexes further along wrong. It worked OK with this data set only because the one duplicate pair to be spliced out is the last element. When I moved the duplicate from the last position in the arrays to the next-to-last, I got warnings, although for some reason the data still came out OK. I eliminated the warnings by changing the for loop to for (reverse 0..$#array1).
Ned's works but I don't know why indexing a higher number, $i++, after a splice doesn't cause problems as the array keeps getting resized with splice. Hope someone might know. A post on c.l.p.m.,
http://groups-beta.google.com/group/comp.lang.perl.misc/msg/49831a95770a2ee5 discusses this, and a few links from my Perl Monks search seemed to recommend counting backwards in the for loop.
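The index shift described above can be seen in a minimal sketch. The data is made up for illustration: two adjacent matches are enough to show a forward loop skipping an element, while a backward loop does not.

```perl
use strict;
use warnings;

# Made-up data: try to remove every 'x'.
my @forward = qw(a x x b);
for my $i (0 .. $#forward) {
    # The index list was fixed before the loop, so guard against
    # running past the end of the now shorter array.
    next unless defined $forward[$i];
    splice(@forward, $i, 1) if $forward[$i] eq 'x';
}
print "@forward\n";     # a x b  (the second 'x' slid into index 1 and was skipped)

my @backward = qw(a x x b);
for my $i (reverse 0 .. $#backward) {
    # Removing index $i only moves elements above $i, which were already visited.
    splice(@backward, $i, 1) if $backward[$i] eq 'x';
}
print "@backward\n";    # a b
```

Without the `defined` guard, the forward loop would also read past the end of the array, which is where "uninitialized value" warnings come from.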
My solution walked from the end of the array to the front, splicing when duplicate pairs were found.
#!/usr/bin/perl
use strict;
use warnings;
my @a1 = qw(13470660 13471850 14028274 14028286);
my @a2 = qw(14028145 14028286 13476691 13471850);
my %hash;
for (reverse 0..$#a1) {
# Two checks: has this number from the second array already been
# seen in the first array, and if so, is this pair a 'reversal'
# (flip-flop) of that earlier pair and thus a duplicate?
if (exists $hash{$a2[$_]} && $hash{$a2[$_]} == $a1[$_]) {
splice @a1, $_, 1;
splice @a2, $_, 1;
}
else {
$hash{$a1[$_]} = $a2[$_];
}
}
print "@a1\n@a2\n";
Chris
Ned's works but I don't know why indexing a higher number, $i++, after a splice doesn't cause problems as the array keeps getting resized with splice. Hope someone might know.
The answer lies in the result.
1 9 3 5 4 2
3 2 1 6 7 9
At i=0, the inner foreach loop finds a match and splices out the pair at index 0; $i then increments to 1.
9 3 5 4 2
2 1 6 7 9
Then at i=4, the value 2 matches, so the pair at index 4 is spliced out: the second copy of that duplicate (2/9) rather than the first (9/2).
9 3 5 4
2 1 6 7
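Since the result above depends on how the shifting indexes happen to line up, one alternative is to avoid splice entirely. The following is only a sketch, not nedals's method: it builds an order-independent key for each pair and keeps the first occurrence, so it keeps different copies of the duplicates than the code above (the question says it does not matter which copy is removed). The names `%seen` and `@keep` are illustrative.

```perl
use strict;
use warnings;

my @dataA = qw(1 9 3 5 4 2);
my @dataB = qw(3 2 1 6 7 9);

my %seen;
my @keep = grep {
    # Sorting the two values makes (1,3) and (3,1) produce the same key.
    my $key = join ' ', sort { $a <=> $b } $dataA[$_], $dataB[$_];
    !$seen{$key}++;    # true only the first time this key appears
} 0 .. $#dataA;

# Slice both arrays with the same index list so they stay aligned.
@dataA = @dataA[@keep];
@dataB = @dataB[@keep];

print "@dataA\n@dataB\n";
# 1 9 5 4
# 3 2 6 7
```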
Re: comparing arrays
by TedPride (Priest) on Dec 16, 2004 at 19:53 UTC
How large are the arrays going to be, and how often / how many dupes are there likely to be? My solution below assumes that the arrays are fairly small and won't suffer much from being modified in place. I'm also assuming that you only have pairs (both arrays same length, no missing array cells).
The interesting thing about my solution is that it returns dupe counts, so you could theoretically even sort dupes by the number of times they appear.
use strict;
use warnings;
my @array1 = qw(13470660 13471850 14028274 14028286);
my @array2 = qw(14028145 14028286 13476691 13471850);
my (%keys, $key, @dupes);
for (0..$#array1) {
if ($array1[$_] < $array2[$_]) { $key = "$array1[$_] $array2[$_]"; }
else { $key = "$array2[$_] $array1[$_]"; }
if ($keys{$key}++ == 1) {
splice(@array1, $_, 1); splice(@array2, $_, 1);
push(@dupes, $key);
}
}
for (0..$#array1) {
print "$array2[$_] $array1[$_]\n";
}
print "\n$_ ".($keys{$_}-1) for (@dupes);
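The idea of sorting dupes by how often they appear could look roughly like this. It is a sketch under the same assumptions as the code above (numeric values, arrays of equal length); it only counts and does not modify the arrays, and the names `%count`, `$lo` and `$hi` are illustrative.

```perl
use strict;
use warnings;

# Same sample data as above.
my @array1 = qw(13470660 13471850 14028274 14028286);
my @array2 = qw(14028145 14028286 13476691 13471850);

# Count each pair under an order-independent key (smaller value first).
my %count;
for my $i (0 .. $#array1) {
    my ($lo, $hi) = sort { $a <=> $b } $array1[$i], $array2[$i];
    $count{"$lo $hi"}++;
}

# Pairs seen more than once, most frequent first; ties broken by key.
my @dupes = sort { $count{$b} <=> $count{$a} || $a cmp $b }
            grep { $count{$_} > 1 } keys %count;

print "$_ appears $count{$_} times\n" for @dupes;
# 13471850 14028286 appears 2 times
```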