in reply to Re^4: Finding Nearly Identical Sets (Updated:4200/sec)
in thread Finding Nearly Identical Sets

I'm not sure an in-memory solution will work because of parallel processing.

Hm. The primary reason -- there are others -- for using parallel processing is: speed.

I pretty much guarantee that you will not be able to achieve 500/s using a disk-based file or DB, let alone 5000/s -- disk access is at least 100,000 times slower than memory -- which means you now need 10 processors instead of one just to get back to par.

And if 5000/s isn't enough? Put the bitmaps in shared memory (NOT threads::shared) and run multiple threads...
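
As a rough illustration of the shared-memory idea, here is a minimal sketch in Python rather than Perl, using the standard `multiprocessing.shared_memory` module. The fixed 8-byte bitmap size, the function names `create_store` and `count_near`, and the "differ by at most N bits" test are all assumptions made for the example; they are not taken from the thread, and this is not necessarily how it would be done in Perl.

```python
from multiprocessing import shared_memory

BITMAP_BYTES = 8  # hypothetical size: 64 bits per set


def create_store(bitmaps):
    """Copy equal-length bitmaps into one shared block and return its handle."""
    shm = shared_memory.SharedMemory(create=True, size=len(bitmaps) * BITMAP_BYTES)
    for i, bm in enumerate(bitmaps):
        shm.buf[i * BITMAP_BYTES:(i + 1) * BITMAP_BYTES] = bm
    return shm


def count_near(name, n_bitmaps, query, max_diff):
    """Attach to the block by name; count bitmaps within max_diff bits of query."""
    shm = shared_memory.SharedMemory(name=name)  # attaches, does not copy the data
    try:
        q = int.from_bytes(query, "big")
        hits = 0
        for i in range(n_bitmaps):
            chunk = bytes(shm.buf[i * BITMAP_BYTES:(i + 1) * BITMAP_BYTES])
            # XOR leaves a 1 bit wherever the two bitmaps disagree
            diff = bin(int.from_bytes(chunk, "big") ^ q).count("1")
            if diff <= max_diff:
                hits += 1
        return hits
    finally:
        shm.close()
```

Each worker would call `count_near` with the block's name and its own share of the queries, so the bitmaps exist once in memory however many workers there are. The creator calls `close()` and `unlink()` on the handle when everyone is done.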

Anyway, good luck with the project whichever way you choose to go :)


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority". I knew I was on the right track :)
In the absence of evidence, opinion is indistinguishable from prejudice.