
Yeah ... I don't get where he gets "92" from. Like I said, I thought of ANDing the two 256-bit numbers together and counting 1s.
--
Jeff Boes
Database Engineer
Nexcerpt, Inc.
vox 269.226.9550 ext 24
fax 269.349.9076
 http://www.nexcerpt.com
...Nexcerpt...Connecting People With Expertise

Re^5: Fingerprinting text documents for approximate comparison
by ww (Archbishop) on Mar 26, 2005 at 16:35 UTC
    re "92" -- 92 decimal = 128 - 36

    re AND
    sprintf to binary first?
     sprintf("%b", $num0 & $num1)
    then count the "1"s in the result.
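A caveat on the suggestion above: sprintf("%b", ...) with a numeric & only covers values that fit in a native integer (64 bits at most), so it will not handle a 256-bit fingerprint directly. Below is a minimal sketch of one way around that, assuming each fingerprint is stored as a 32-byte packed string. The helper name shared_bits and the sample values are hypothetical, made up for illustration, and this is not necessarily how the original poster stores the fingerprints.

```perl
use strict;
use warnings;

# Hypothetical helper: count the 1 bits that two equal-length
# fingerprints have in common. Each fingerprint is assumed to be
# a packed byte string (32 bytes = 256 bits).
sub shared_bits {
    my ($fp0, $fp1) = @_;

    # When both operands are strings, & works byte by byte
    # instead of converting the operands to numbers.
    my $and = $fp0 & $fp1;

    # The %32 prefix asks unpack for a checksum of the unpacked
    # bits, which for b* is simply the number of set bits.
    return unpack('%32b*', $and);
}

# Sample data (made up): every bit set vs. the low nibble of each byte set.
my $fp0 = pack('H*', 'ff' x 32);
my $fp1 = pack('H*', '0f' x 32);

print shared_bits($fp0, $fp1), "\n";    # prints 128 (4 bits x 32 bytes)
```

For numbers that do fit in a native integer, the sprintf route works as described: take the "%b" string and count the ones with tr/1//.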

    . o O: hope I'm not "teaching my grandparent how to suck eggs."

    (Updated 28 Mar 05 with additional refs below:)
    Potentially useful refs:

    http://www3.interscience.wiley.com/cgi-bin/abstract/102525285/ABSTRACT ( J.ASIS&T )

    Approximate Text Addressing (ATA) (qv): referenced in: http://www.springerlink.com/app/home/contribution.asp?wasp=800f3c48701942fcab2e5a0926d5068e&referrer=parent&backto=issue,1,25;journal,738,1955;linkingpublicationresults,1:105633,1 ( Lecture Notes in Computer Science, Publisher: Springer-Verlag GmbH )

    Improved robustness of signature-based near-replica detection via lexicon randomization, https://portal.acm.org/poplogin.cfm?dl=GUIDE&coll=GUIDE&comp_id=1014127&want_href=delivery%2Ecfm%3Fid%3D1014127%26type%3Dpdf&CFID=40869960&CFTOKEN=47922358&td=1112025019666 ( ACM account required ). BTW, scholar.google.com finds numerous other papers from the past couple of years, but, like this one, they require $$$ accounts.