in reply to Fast algorithm for 2d array queries

Problem You've spent over half your post describing your solution for a minimal version of your problem, that doesn't scale. But you haven't actually state what the problem is.

For example. You've told us that you've a 2d array of integers; that those integers range between 1 & 3e6 (twice) ; and there are ~50000 in each inner array.

But you fail to say how big the main array is?

Or what you need to look up. Ie. what do you start with; and what information do you need to end up with?

Or how many lookups you need to do?

And will you do these lookups once? Or once a week? Or once an hour?


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
  • Comment on Re: Fast algorithm for 2d array queries

Replies are listed 'Best First'.
Re^2: Fast algorithm for 2d array queries
by baxy77bax (Deacon) on Feb 07, 2014 at 10:24 UTC
    Fair enough

    "But you fail to say how big the main array is? "

    not that big, the size of the main array is 3000. n is 500-1000 (the number of inner arrays from which i need to pick)

    "Or what you need to look up. Ie. what do you start with; and what information do you need to end up with?"

    i start with my query array which is 500-1000 different ints between 0 and 2999 and i need to end up with the int value from inner arrays that is the most prevalent on those picked 500-1000 arrays. (intersect that has the highest number of elements)

    "Or how many lookups you need to do? "

    about 15000000 per hour.

      "Or how many lookups you need to do? " about 15000000 per hour.

      That's 4000 per second or 1 every 1/4 of a millisecond.

      In that time you want to intersect 500 to 1000 (from 3,000) sets of 50,000 integers and extract the single most populous integer across them all.

      Do you have a 3000 machine cluster available to throw at this problem?


      With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.