in reply to Re^2: Super fast file creation needed
in thread Super fast file creation needed
I understood that the logfiles were huge, which, IMHO, makes storing entire lines as hash keys impractical because of the memory it would take.
Sure, computing checksums/digests might slow things down some, but it is one way to tell whether a line has been seen before without keeping the line itself. With a long enough digest, collisions between different lines could be virtually eliminated.
In this case, I think the memory considerations outweigh the speed considerations, but it would certainly be prudent to benchmark both ways to see which one works better.
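To make the digest idea concrete, here is a minimal sketch, not a tested solution for the original poster's files. It assumes MD5 via the core Digest::MD5 module (the choice of digest is mine, nothing in the thread fixes it), and the name dedupe is made up for illustration. Only the raw 16-byte digest of each line is kept as a hash key, never the line itself.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5);

# Copy lines from $in to $out, skipping any line whose digest was already seen.
# Returns the number of lines written. (dedupe is a hypothetical helper name.)
sub dedupe {
    my ($in, $out) = @_;
    my %seen;
    my $kept = 0;
    while (my $line = <$in>) {
        # md5() returns the raw 16-byte digest; that is the only thing stored.
        next if $seen{ md5($line) }++;
        print {$out} $line;
        $kept++;
    }
    return $kept;
}

# Demo: an in-memory filehandle stands in for a huge logfile.
my $log = "alpha\nbeta\nalpha\ngamma\nbeta\n";
open my $in, '<', \$log or die "Cannot open in-memory file: $!";
my $kept = dedupe($in, \*STDOUT);
close $in;
print STDERR "kept $kept unique lines\n";
```

Keep in mind that every hash entry carries some fixed overhead on top of the key, so the saving should only be noticeable when the lines are a good deal longer than 16 bytes. That is one more reason to benchmark both approaches on real data.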
Re^4: Super fast file creation needed
by dsheroh (Monsignor) on Oct 19, 2007 at 14:50 UTC