in reply to list of unique strings, also eliminating matching substrings
When you say "about 300 characters long", what is the actual range? Are there any constraints on where a substring may match within a larger string? Can there be exact matches within the set of strings and, if so, should the duplicates be removed?
Update: I see the length question has already been answered.
Update: and the key question I didn't ask: how many strings of the original 100,000 do you expect you might end up with after duplicates and substrings are removed?
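To make these questions concrete, here is a minimal sketch of the reduction as I understand it, written in Python for brevity. It assumes that exact duplicates should be dropped and that a substring may match anywhere inside a larger string; both are assumptions pending your answers, and the function name reduce_strings is only illustrative.

```python
def reduce_strings(strings):
    """Drop exact duplicates, then drop any string found inside a longer kept string."""
    unique = set(strings)  # assumption: exact duplicates are unwanted
    kept = []
    # Visit longest first, so a string can only be contained in one already kept.
    for s in sorted(unique, key=lambda t: (-len(t), t)):
        # assumption: a match at any position in a kept string counts
        if not any(s in k for k in kept):
            kept.append(s)
    return kept


if __name__ == "__main__":
    print(reduce_strings(["abc", "bc", "abc", "xyz", "abcd"]))
```

Each candidate is compared against every string kept so far, so the work in this naive version grows with the number of survivors. That is why the expected size of the final list matters when choosing an approach for 100,000 strings.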
Re^2: list of unique strings, also eliminating matching substrings
by lindsay_grey (Novice) on May 30, 2011 at 22:42 UTC