Re: Finding duplicate keys

A fairly straightforward solution may be to sort both arrays (I don't know whether order matters, but you said key, so I suspect it doesn't), keep two variables holding the current index into each array, and compare the current keys as you go along, advancing whichever index points at the smaller key. Once the arrays are sorted, this takes as few as scalar @shorterArray comparisons and as many as scalar @shorterArray + scalar @longerArray. As an added bonus, only two extra variables are required and there's no memory overhead from hashes. Anyhow, as my explanations are usually said to be terrible, the following may be what you're looking for:

Untested code follows:

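# block form, so the function name is not parsed as sort's comparator
my @array1 = sort { $a cmp $b } getArray1();
my @array2 = sort { $a cmp $b } getArray2();
my ($i1,$i2) = (0,0);
while (defined($array1[$i1]) && defined($array2[$i2])) {
  if ($array1[$i1] eq $array2[$i2]) {
    print "$array1[$i1] is repeated at position array1[$i1] and array2[$i2]$/";
    $i1++;
    $i2++;
  }
  # elsif, so a match does not fall through and advance $i1 a second time
  # note: '>' compares numerically; 'gt' is the string counterpart of 'eq'
  elsif ($array1[$i1] > $array2[$i2]) { $i2++; }
  else { $i1++; }
}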

Note that the code assumes that each "key" within an individual array is unique. Otherwise, it may skip over duplicated keys in cases such as:

@array1 = ( "a", "a", ... );
@array2 = ( "a", "b", ... );

Results:
a is repeated at position array1[0] and array2[0]

It will catch the match between the first a in each array, but it will not report anything for the a at $array1[1]. Of course, this case can easily be detected.
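One way the repeated-key case might be detected is sketched below. This is only an illustration, not the code from the post: the sub name common_keys and the left/right/key hash fields are made up for the example, and it uses cmp for the ordering test. When two keys match, it skips past every repeat of that key on both sides before moving on, so each shared key is reported once together with all of its positions. Note that the positions refer to the sorted copies, not the original arrays.

```perl
use strict;
use warnings;

# Hypothetical helper (name and return shape are assumptions for this
# sketch): walks two sorted copies of the input lists and returns one
# record per key that appears in both, tolerating repeated keys.
sub common_keys {
    my ($left, $right) = @_;
    my @l = sort { $a cmp $b } @$left;
    my @r = sort { $a cmp $b } @$right;
    my ($i, $j) = (0, 0);
    my @found;
    while ($i < @l && $j < @r) {
        my $order = $l[$i] cmp $r[$j];
        if    ($order < 0) { $i++ }    # left key is smaller, advance left
        elsif ($order > 0) { $j++ }    # right key is smaller, advance right
        else {
            my $key = $l[$i];
            my ($from_i, $from_j) = ($i, $j);
            # Skip past every repeat of this key on both sides.
            $i++ while $i < @l && $l[$i] eq $key;
            $j++ while $j < @r && $r[$j] eq $key;
            push @found, {
                key   => $key,
                left  => [ $from_i .. $i - 1 ],   # indices in sorted left list
                right => [ $from_j .. $j - 1 ],   # indices in sorted right list
            };
        }
    }
    return @found;
}

for my $hit (common_keys([qw(a a c)], [qw(a b c)])) {
    print "$hit->{key}: array1[", join(',', @{ $hit->{left} }),
          "] array2[", join(',', @{ $hit->{right} }), "]\n";
}
# prints:
# a: array1[0,1] array2[0]
# c: array1[2] array2[2]
```

A left or right list with more than one index means that key is repeated within that array, which is exactly the case the simpler loop passes over.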

Updated: a friend thought my explanation was a bit on the terrible side so I added another sentence and reworded the part about comparisons.

antirice
The first rule of Perl club is - use Perl
The ith rule of Perl club is - follow rule i - 1 for i > 1

Re^2: Finding duplicate keys (cmp) by Aristotle (Chancellor) on Apr 06, 2003 at 00:35 UTC
Why are you using `eq` in one comparison and `>` in the other? Besides, using `cmp` you can avoid the duplicate comparison altogether. Your looping condition is not robust, either.

my @array1 = sort <$file1>;
my @array2 = sort <$file2>;
my ($l,$r) = (0) x 2;
while ($l < @array1 and $r < @array2) {
  if(my $b = $array1[$l] cmp $array2[$r]) {
    ++($b == 1 ? $r : $l);
  }
  else {
    print "Dupe: $array1[$l]\n";
    ++$l;
    ++$r;
  }
}

Makeshifts last the longest.
Re: Re^2: Finding duplicate keys by antirice (Priest) on Apr 06, 2003 at 01:12 UTC
Thanks for pointing out the inadequacies of my code. I didn't think of using cmp as I haven't used it very frequently. :-/

antirice
The first rule of Perl club is - use Perl
The ith rule of Perl club is - follow rule i - 1 for i > 1