in reply to Quantitative Change instead of Boolean

One not-so-crude method of establishing the similarity between texts was presented to me in String::Trigram, which its author used for just that: determining how similar two webpages are. The method is still crude in that it doesn't discriminate between HTML markup and "real" text, but the author claimed it worked well enough for him, because minor differences such as timestamps changed the resulting number only slightly.
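
To make the idea concrete, here is a minimal pure-Perl sketch of trigram similarity. It is not taken from String::Trigram and may well differ from the formula that module uses; the sub names `trigrams` and `similarity`, the two-space padding, and the shared-over-total ratio are all my own assumptions for illustration.

```perl
use strict;
use warnings;

# Count the trigrams (3-character windows) of a string.
# Assumption: the text is lowercased and padded with two spaces on each
# side, so that the first and last characters also appear in three trigrams.
sub trigrams {
    my ($text) = @_;
    my $padded = "  " . lc($text) . "  ";
    my %count;
    $count{ substr($padded, $_, 3) }++ for 0 .. length($padded) - 3;
    return \%count;
}

# Similarity between 0 and 1: trigrams the two strings share, divided by
# all trigrams seen in either (a Jaccard ratio over trigram counts).
sub similarity {
    my ($x, $y) = @_;
    my ($tx, $ty) = (trigrams($x), trigrams($y));
    my %seen = (%$tx, %$ty);    # union of the trigram keys
    my ($shared, $total) = (0, 0);
    for my $t (keys %seen) {
        my $cx = $tx->{$t} // 0;
        my $cy = $ty->{$t} // 0;
        $shared += $cx < $cy ? $cx : $cy;    # min of the two counts
        $total  += $cx > $cy ? $cx : $cy;    # max of the two counts
    }
    return $total ? $shared / $total : 1;
}
```

Calling `similarity($old_page, $new_page)` on two fetches of the same page gives a number rather than a yes/no answer: identical strings score 1, strings with no trigram in common score 0, and a changed timestamp only disturbs the handful of trigrams that overlap it. As noted above, HTML markup is counted like any other text.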

Re^2: Quantitative Change instead of Boolean
by shmem (Chancellor) on Jun 29, 2006 at 23:14 UTC
    Still crude, not-so-crude... it just depends on how, and to which ends, you twiddle and tweak your ideas, which at the beginning are often crude or highly volatile ;)

    cheers,
    --shmem

    _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                  /\_¯/(q    /
    ----------------------------  \__(m.====·.(_("always off the crowd"))."·
    ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}