Re: redundancy Checker

One way that comes to mind is to use a sorted list of the data. When you get a new element, find its place in the list. If it's already there, then you don't need to add it. If it's not there, then add it in the proper place.
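A minimal sketch of that idea, assuming the elements are plain strings compared with lt/eq. The helper name insert_unique and the sample data are made up for illustration; they are not from the original question.

```perl
use strict;
use warnings;

# Insert $item into the sorted array @$list unless it is already present.
# Returns 1 if the item was added, 0 if it was a duplicate.
# insert_unique is a hypothetical helper; string ordering is an assumption.
sub insert_unique {
    my ($list, $item) = @_;
    my ($lo, $hi) = (0, scalar @$list);

    # Binary search for the first position whose element is not lt $item.
    while ($lo < $hi) {
        my $mid = int(($lo + $hi) / 2);
        if ($list->[$mid] lt $item) {
            $lo = $mid + 1;
        }
        else {
            $hi = $mid;
        }
    }

    # Already there: nothing to add.
    return 0 if $lo < @$list && $list->[$lo] eq $item;

    # Not there: splice it in at its sorted position.
    splice @$list, $lo, 0, $item;
    return 1;
}

my @seen;
for my $item (qw(pear apple fig apple pear kiwi)) {
    print "$item: ", (insert_unique(\@seen, $item) ? "added" : "duplicate"), "\n";
}
print "@seen\n";    # apple fig kiwi pear
```

The search itself is logarithmic, but splice still has to shift the elements after the insertion point, so each insert can cost time proportional to the length of the list.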

HTH.


Replies are listed 'Best First'.
Re^2: redundancy Checker
by Tanktalus (Canon) on Jul 27, 2005 at 03:14 UTC

Um, wouldn't that be faster by just inserting into a hash - when you get a new element, see if it's already in the hash?

However, the OP is really asking much more about SQL than Perl. Not that it's off-topic in my opinion, but answers should probably focus on the OP's problem area. Lists and hashes are unlikely to be the best solutions.

The problem really is in defining what "duplicate" is, and then devising either tables or SQL queries (or some combination of both) to expose those duplicates. Even using lists or hashes, we would still need a better definition of what "duplicate" means to know what to sort on, or what to use as the hash key.
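To illustrate the hash approach and why the definition of "duplicate" matters, here is a rough sketch. The OP's real schema isn't known, so the record fields (name, email) and the rule in dup_key (trim whitespace, ignore case) are purely assumptions chosen for the example; it also needs Perl 5.10 or later for the // operator.

```perl
use strict;
use warnings;

# Hypothetical definition of "duplicate": two records match when their
# name and email agree after trimming whitespace and ignoring case.
# The field names are invented for illustration only.
sub dup_key {
    my ($rec) = @_;
    return join "\0", map {
        my $v = lc($rec->{$_} // '');
        $v =~ s/^\s+|\s+$//g;
        $v;
    } qw(name email);
}

my @records = (
    { name => 'Ann Lee',  email => 'ann@example.com' },
    { name => 'Bob Ray',  email => 'bob@example.com' },
    { name => ' ann lee', email => 'ANN@example.com' },
);

my (%seen, @unique, @dups);
for my $rec (@records) {
    # $seen{...}++ is false the first time a key appears, true afterwards.
    if ($seen{ dup_key($rec) }++) {
        push @dups, $rec;
    }
    else {
        push @unique, $rec;
    }
}

printf "%d unique, %d duplicate\n", scalar @unique, scalar @dups;
```

Changing dup_key changes which records count as duplicates, which is the whole point: the same choice of fields would presumably become the column list of a GROUP BY ... HAVING COUNT(*) > 1 query on the SQL side.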
