in reply to Re: Fingerprinting text documents for approximate comparison
in thread Fingerprinting text documents for approximate comparison
That isn't going to be useful. The MD5 algorithm is expressly designed to detect differences, not similarity:
    use Digest::MD5 qw[md5_hex];
    my $s = 'the quick brown fox jumps over the lazy dog';

    print md5_hex $s;
    # 77add1d5f41223d5582fca736a5cb335

    print md5_hex $s . 's';
    # 5e48a737eaff799917707b2815af10fc

    print md5_hex $s . 'S';
    # d02763729a741eed14417a1051ec228c
Even adding a single character, or changing a single bit ('s' and 'S' differ by exactly one bit), produces a (numerically) completely unrelated digest. That is exactly as it should be for the purposes for which MD5 is designed, but completely wrong for this application.
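To put a number on "completely unrelated", here is a rough sketch that counts how many of the 128 digest bits change when one character is appended. The helper name `bit_distance` is my own invention for illustration, not part of Digest::MD5. It XORs the two raw digests and counts the set bits with unpack's checksum form.

```perl
use strict;
use warnings;
use Digest::MD5 qw[md5];

# Hypothetical helper (not part of Digest::MD5): number of bit
# positions at which two equal-length binary strings differ.
sub bit_distance {
    my ($x, $y) = @_;
    # String XOR leaves a 1 bit wherever the inputs disagree;
    # '%32b*' sums those bits.
    return unpack '%32b*', $x ^ $y;
}

my $s = 'the quick brown fox jumps over the lazy dog';

for my $suffix ('s', 'S') {
    my $d = bit_distance(md5($s), md5($s . $suffix));
    print "appending '$suffix': $d of 128 digest bits differ\n";
}
```

For a well-behaved hash you would expect the count to hover somewhere around half the bits no matter how small the change to the input, so the distance between digests tells you nothing about how close the documents are.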
Re^3: Fingerprinting text documents for approximate comparison
by gam3 (Curate) on Mar 25, 2005 at 03:11 UTC