in reply to speedy way to compare elements in pairs of arrays?

I think we have a XY Problem problem here

To store 10^12 arrays of size 1000 you would need at least 1000 Terabyte. Since you probably don't have that amount of storage I assume you already multiplied your 1 million arrays to arrive at that number. Whatever you do you will never speed up the comparisions enough to make 10^12 comparisions fast. You have to ask instead how to avoid making 10^12 comparisions.

I also assume the sets of different elements is relatively small and the criteria are not fixed, i.e. it is not always element 3 you are comparing.

Then a solution could be to use a database or even simple files where you store the id of all arrays that have element m as the n-th element

So if you use files, you would in a first step preprocess your arrays and create the files a0,a1,a2...a1000,b1...b1000,c1...c1000,... . In file a0 you would store all array-ids where an a is in position 0. So both of your example arrays above would be listed in a0.

That would be a few thousand files, but a modern filesystem can handle that easily. Now your example search would work differently: You would simply list all array id-pairs in the files a0+b0,a0+c0,a0+d0...,b0+c0,b0+d0,... except for the files a0+x0,b0+x0... If there are none, resume with a1+b1,a1+c1...

Note that if you have more criteria to further restrict your solution space, those criteria should be able to prune down the file pairs *before* you have to read any of these files

This solution might not be the best way to represent your data. Provide more information and a better solution might surface

  • Comment on Re: speedy way to compare elements in pairs of arrays?

Replies are listed 'Best First'.
Re^2: speedy way to compare elements in pairs of arrays?
by AnomalousMonk (Archbishop) on May 26, 2009 at 09:31 UTC