in reply to Searching first array in second array.

Hi!

You write that you use nested for. That is a very inefficient way to do it. Suppose n is the size of the array, you would need O(n^2) compares. In your case it is ~7000000*7000000, which is very much.

The following algorithm would be much faster O(n) (or ~7000000*3):

  1. create a temporary hash
  2. add all elements of the first array to that hash; use the array-value as a key and 1 as a value
  3. pass through the second array; if the current value is already in the hash, increase its value by 2
  4. pass through the hash; if all values are 3, array1 is part of array2

Another idea could be to sort both arrays (the perl sort-function is more efficient than a "nested-for-(bubble-)sort") and then work on the sorted arrays.

HTH, Rata

PS.: please be aware that the algorithm above does not work if the first array contains some values more than once!
PPS.: and of course you can improve the algorithm by adding checks on "more than once" in step 3 - and aborting the rest of the loops then
  • Comment on Re: Searching first array in second array.

Replies are listed 'Best First'.
Re^2: Searching first array in second array.
by Corion (Patriarch) on Jan 19, 2010 at 12:19 UTC

    As soon as you create "buckets" instead of single hashes, the algorithm also works for multiple elements per key:

    my @left = qw(1 2 3 1 1); my %left; for my $l (@left) { my $key = $l; # maybe you want some more complicated key than the +item itself $left{ $key } ||= []; push @{ $left{ $key } }, $l; }; for my $r (@right) { my $key = $r; # maybe you want some more complicated key than the +item itself if (@{ $left{ $key }}) { my $found = shift @{ $left{ $key }}; print "For $key, I found the pair ($found, $r).\n"; } elsif (exists $left{ $key }) { print "For $key, there is at one item on the right side left o +ver: $r.\n"; } else { print "For $key, there is no item on the left side at all: $r. +\n"; }; }; # Now lets see if we have any values on the left side that did not pai +r: for my $l (keys %left) { my @leftover = @{ $left{ $l } }; for (@leftover) { print "For $l, there was no corresponding element on the right + side ($_)\n"; }; };

    I haven't checked that code, but the approach is one I've used lots and lots when reconciling lists :)