Thanks for your reply, Corion.
"In the end, you will still need to have all keys in memory, or at least accessible"
Why? In the case of a database I can just try to insert a new value. If that value already exists in the table, I'll get an exception like 'Cannot insert a duplicate value bla-bla-bla'. Otherwise the new value will be inserted into the table.
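The idea could be sketched roughly like this (a minimal sketch, assuming SQLite via DBI and a table named seen_keys with a PRIMARY KEY on the key column; the file, table, and column names are just illustrative):

    #!/usr/bin/perl
    use strict;
    use warnings;
    use DBI;

    # Hypothetical setup: an SQLite file and a table whose key column is a
    # PRIMARY KEY, so inserting a duplicate raises an error.
    my $dbh = DBI->connect('dbi:SQLite:dbname=keys.db', '', '',
                           { RaiseError => 1, PrintError => 0 });
    $dbh->do('CREATE TABLE IF NOT EXISTS seen_keys (k TEXT PRIMARY KEY)');

    my $ins = $dbh->prepare('INSERT INTO seen_keys (k) VALUES (?)');

    # Returns true if the key was new, false if it was already in the table.
    sub is_new_key {
        my ($key) = @_;
        my $ok = eval { $ins->execute($key); 1 };
        return $ok ? 1 : 0;
    }

    print is_new_key('abc') ? "new\n" : "duplicate\n";   # new
    print is_new_key('abc') ? "new\n" : "duplicate\n";   # duplicate

With the constraint in place, the database does the duplicate detection against its on-disk index, so the full set of keys doesn't have to sit in the process's memory.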
"a million keys shouldn't eat too much memory"
The most important criterion for me is the speed of processing new values. I haven't tried the database approach yet, but with the hash approach, processing one value takes about 40 seconds with 1 million hash keys. And as the number of keys grows, the processing time grows too.
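For comparison, a typical hash-based duplicate check looks something like this (a minimal sketch, assuming the keys arrive one per line on STDIN; not my actual code):

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Keep every key seen so far in a hash; lookup and insert are both
    # roughly constant time, regardless of how many keys are stored.
    my %seen;
    while (my $key = <STDIN>) {
        chomp $key;
        next if $seen{$key}++;   # duplicate: skip it
        print "$key\n";          # first occurrence: keep it
    }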
---
Michael Stepanov aka nite_man
It's only my opinion and it doesn't have pretensions of absoluteness!