in reply to Quantitative Change instead of Boolean
One not-so-crude method of establishing the similarity between texts that I came across is String::Trigram, which its author used for just that: determining how similar two web pages are. The method is still crude in that it doesn't discriminate between HTML markup and "real" text, but the author claimed it worked well enough for him, because minor differences such as timestamps changed the resulting number only slightly.
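To give a feel for the idea, here is a minimal sketch of a trigram-based similarity score in Python. It is not String::Trigram's actual algorithm or API. It assumes a plain Jaccard ratio over the sets of three-character substrings, and the names `trigrams` and `similarity` are made up for this illustration.

```python
def trigrams(text):
    """Return the set of all 3-character substrings of text (lowercased)."""
    text = text.lower()
    return {text[i:i + 3] for i in range(len(text) - 2)}


def similarity(a, b):
    """Jaccard ratio of the two trigram sets: 1.0 = same set, 0.0 = disjoint.

    Assumption for this sketch only; String::Trigram may weight or
    pad its trigrams differently.
    """
    ta, tb = trigrams(a), trigrams(b)
    if not ta and not tb:
        # Both texts are shorter than 3 characters: fall back to equality.
        return 1.0 if a.lower() == b.lower() else 0.0
    return len(ta & tb) / len(ta | tb)


# Two hypothetical pages that differ only in their timestamp.
page_a = "<html><body><p>Hello world, this is a page.</p><span>2006-06-29 23:14</span></body></html>"
page_b = "<html><body><p>Hello world, this is a page.</p><span>2006-06-30 08:02</span></body></html>"

if __name__ == "__main__":
    print(similarity(page_a, page_b))
```

Because only the trigrams that overlap the timestamp differ, the score for the two sample pages stays close to 1, which is the quantitative (rather than boolean) answer asked for. As noted above, the markup is counted along with the text.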
Re^2: Quantitative Change instead of Boolean
by shmem (Chancellor) on Jun 29, 2006 at 23:14 UTC