in reply to Re: Search for identical substrings
in thread Search for identical substrings

I have put <code> tags around the data on my scratchpad.

Yes, a match shorter than 200 characters doesn't mean much. Matches below 200 characters occur by random chance too frequently.

I would also like to know how many distinct identical substrings longer than 200 characters are shared between pairs of the 3k strings. I hope this statement makes sense.
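
To make the question concrete, here is a rough Python sketch of the count I have in mind. It is only an illustration under my own assumptions: the function names (shared_substrings, count_shared_per_pair) and the threshold parameter are made up for this post, and I am assuming that "identical substring" means a maximal common run, i.e. one that cannot be extended on either side. It is a brute-force dynamic-programming comparison, so it would probably be far too slow for all pairs of 3k long strings; something index-based would likely be needed in practice.

```python
from itertools import combinations

def shared_substrings(a, b, threshold=200):
    """Return the distinct maximal substrings common to a and b
    that are longer than `threshold` characters (hypothetical helper)."""
    found = set()
    # prev[j] = length of the common run ending at a[i-2] and b[j-1]
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                # The run is maximal if it cannot be extended to the right.
                at_end = i == len(a) or j == len(b)
                if cur[j] > threshold and (at_end or a[i] != b[j]):
                    found.add(a[i - cur[j]:i])
        prev = cur
    return found

def count_shared_per_pair(strings, threshold=200):
    """Map (i, j) index pairs to the number of distinct shared substrings
    longer than `threshold`; pairs with no such substring are omitted."""
    counts = {}
    for (i, a), (j, b) in combinations(enumerate(strings), 2):
        n = len(shared_substrings(a, b, threshold))
        if n:
            counts[(i, j)] = n
    return counts
```

For a quick check with toy data, calling count_shared_per_pair(["xxABCDEyyFGHIJzz", "qqABCDEwwFGHIJkk", "nothing"], threshold=3) should report two shared substrings for the first pair and nothing for the other pairs.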