in reply to Re^2: Fingerprinting text documents for approximate comparison
in thread Fingerprinting text documents for approximate comparison
The scale of -128 to +128 appears to be arbitrary; it could just as well be 0 to 256, the count of matching bits.
That inference comes from the author's statement about his example output from two slightly variant sources: "The nilsimsa of these two codes is 92 on a scale of -128 to +128. That means that 36 bits are different and 220 bits the same. Any nilsimsa over 24 (which is 3 sigma) indicates that the two messages are probably not independently generated."

BTW, I attach high value to the observations (below) from BrowserUk and sfink, but (no offense intended, BrowserUk!) I'm not sure I'm ready to buy BrowserUk's "no easy way" either as (1) gospel or, (2) and more important, as any reason not to search for a way.
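To make the arithmetic in the quoted statement concrete, here is a minimal Python sketch. It assumes the score is simply 128 minus the number of differing bits between two 256-bit digests, which matches the quoted numbers (128 - 36 = 92) but is my reading, not something checked against the nilsimsa source. The function names are hypothetical and the two digests are made up for illustration.

```python
def differing_bits(digest_a: str, digest_b: str) -> int:
    """Hamming distance between two 256-bit digests given as 64 hex characters."""
    if len(digest_a) != 64 or len(digest_b) != 64:
        raise ValueError("expected 64 hex characters (256 bits) per digest")
    # XOR leaves a 1 wherever the two digests disagree; count those 1s.
    return bin(int(digest_a, 16) ^ int(digest_b, 16)).count("1")


def similarity_score(digest_a: str, digest_b: str) -> int:
    """Assumed scoring: +128 for identical digests, -128 when every bit differs."""
    return 128 - differing_bits(digest_a, digest_b)


def matching_bits(digest_a: str, digest_b: str) -> int:
    """The same information on a 0 to 256 scale: how many bits agree."""
    return 256 - differing_bits(digest_a, digest_b)


# Made-up digests that differ in exactly 36 bits (nine hex 'f' digits).
a = "0" * 64
b = "f" * 9 + "0" * 55

print(differing_bits(a, b))    # 36
print(similarity_score(a, b))  # 92
print(matching_bits(a, b))     # 220
```

The two scales differ only by a constant offset of 128, which is why the choice between them looks arbitrary.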
Re^4: Fingerprinting text documents for approximate comparison
by BrowserUk (Patriarch) on Mar 24, 2005 at 20:04 UTC
by ww (Archbishop) on Mar 24, 2005 at 20:19 UTC

Re^4: Fingerprinting text documents for approximate comparison
by Mur (Pilgrim) on Mar 24, 2005 at 21:39 UTC
by ww (Archbishop) on Mar 26, 2005 at 16:35 UTC