in reply to comparing 2 files problem
The hash approach is probably best, provided you can guarantee that file 2 will always be small enough to fit in memory. Iterating through file 1 and looking up each key in the hash holding file 2 is an O(n) operation, since each hash lookup is O(1) on average. Yes, there is some time involved in building the hash, but that's only done once, so at worst you're looking at about 2n steps, which is still O(n) (big-O notation drops constant multipliers). Iterating through file 1 and grepping file 2 for each line, on the other hand, is O(n^2) (assuming the second file is about the same size as the first).
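To make the single-pass idea concrete, here is a rough sketch. It's in Python rather than Perl, so a set plays the role of the hash. The function name and the choice to treat each whole line as the key are my assumptions; your real key may be only part of the line.

```python
def common_lines(file1_lines, file2_lines):
    """Return the lines of file 1 that also appear in file 2, in file 1 order.

    Assumption: the whole line (minus its trailing newline) is the key.
    """
    # Build the lookup table from file 2 once; this is the part that
    # must fit in memory.
    seen = {line.rstrip("\n") for line in file2_lines}

    matches = []
    # Single pass over file 1; each membership test is O(1) on average.
    for line in file1_lines:
        key = line.rstrip("\n")
        if key in seen:
            matches.append(key)
    return matches
```

Both arguments can be any iterable of lines, so open file handles work as well as lists. Only file 2 is held in memory; file 1 is read one line at a time.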
One possibility your question was silent on: what happens if something in file 2 doesn't exist in file 1? The methods proposed will silently allow that, and your question leads me to believe that's fine. But just in case, you should realize the question didn't cover that possibility -- probably not a problem, but something to remember.
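If you ever do need to catch that case, one way (again a Python sketch with a set standing in for the hash, a made-up function name, and the whole line assumed to be the key) is to cross off each file 2 key as file 1 mentions it and report whatever is left:

```python
def unmatched_in_file2(file1_lines, file2_lines):
    """Return the set of file 2 keys that never show up in file 1.

    Assumption: the whole line (minus its trailing newline) is the key.
    """
    # Start with every key from file 2.
    remaining = {line.rstrip("\n") for line in file2_lines}

    # Cross off each key that file 1 mentions; discard() is a no-op
    # when the key is not present.
    for line in file1_lines:
        remaining.discard(line.rstrip("\n"))
    return remaining
```

This is still one pass over each file, so it stays O(n).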
Dave
Re^2: comparing 2 files problem
by atcroft (Abbot) on Sep 07, 2004 at 17:11 UTC