Nice replies so far. I would add that a single score returned by String::Compare is probably not meant to be "intuitively" meaningful on its own; the intent seems to be that, when you look for "fuzzy matches" among a set of strings, the relative scores give you a way to rank the candidates by their degree of similarity.

For example, if you had a $str3 set to "don dispeigne", comparing this one to the other two would yield scores that were "worse" (lower, meaning "less similar") than the 0.39871 that you got from comparing your first two strings.
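To make the "ranking by relative score" idea concrete, here is a minimal sketch. It is in Python and uses the standard library's difflib as a stand-in scorer, so it is not String::Compare and will not reproduce its numbers. Apart from "don dispeigne", the candidate strings are made up for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a score in [0, 1]; higher means 'more similar'."""
    return SequenceMatcher(None, a, b).ratio()


def rank_candidates(target: str, candidates: list[str]) -> list[tuple[str, float]]:
    """Score every candidate against target and sort best match first."""
    scored = [(c, similarity(target, c)) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    target = "don dispeigne"
    # Hypothetical candidates, not the strings from the original question.
    candidates = ["completely unrelated", "don despaigne", "don dispeigne"]
    for text, score in rank_candidates(target, candidates):
        print(f"{score:.3f}  {text}")
```

The individual numbers mean little on their own; what matters is the order they put the candidates in.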

I think the issue depends on what you intend to do with the "similarity score" (or "difference score") once you have it: what is your purpose in assigning a numeric value to the difference between two strings?

Having looked at the source code for String::Compare, it's hard to say how its scoring would compare to that of Text::Levenshtein (or to any sort of arithmetic result you might be able to derive from using Algorithm::Diff, which is another fine and powerful module).
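For comparison, an edit-distance approach can be sketched as follows. This is a plain Python implementation of the classic Levenshtein dynamic-programming recurrence, not the Text::Levenshtein module itself. Dividing by the longer string's length to get a 0-to-1 "similarity" is my own assumption about how one might put the result on a scale comparable to a similarity score.

```python
def levenshtein(a: str, b: str) -> int:
    """Number of single-character insertions, deletions, or substitutions
    needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def normalized_similarity(a: str, b: str) -> float:
    """Map the distance onto [0, 1], where 1.0 means identical.
    The normalization is an illustrative choice, not a standard."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest
```

An edit distance counts character operations only, so it knows nothing about consonants, vowels, or words; that is one reason its scores need not line up with String::Compare's.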

I like String::Compare's notion of combining results from a variety of fairly simple, linguistically-motivated tests (score just the consonants, then just the vowels, then all letters, then all characters, then weight the result in some way according to "word count", etc.). But the implementation in that module seems awfully simple... Having used Algorithm::Diff and read its manual, I'm inclined to think that the problem is not as simple as String::Compare's methods would suggest.
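The general idea of combining several simple tests could look something like the sketch below. To be clear, this is not String::Compare's actual implementation: the per-view scorer (Python's difflib), the fixed vowel set, and the equal weighting are all assumptions of mine, and the "word count" weighting is left out entirely.

```python
from difflib import SequenceMatcher

VOWELS = set("aeiou")  # assumption: plain ASCII vowels only


def ratio(a: str, b: str) -> float:
    """Similarity in [0, 1]; two empty strings count as identical."""
    if not a and not b:
        return 1.0
    return SequenceMatcher(None, a, b).ratio()


def only(text: str, keep) -> str:
    """Lower-case text and keep just the characters that pass keep()."""
    return "".join(ch for ch in text.lower() if keep(ch))


def combined_score(a: str, b: str) -> float:
    """Average the similarity of four 'views' of the two strings."""
    views = [
        lambda ch: ch.isalpha() and ch not in VOWELS,  # consonants only
        lambda ch: ch in VOWELS,                       # vowels only
        lambda ch: ch.isalpha(),                       # all letters
        lambda ch: True,                               # all characters
    ]
    scores = [ratio(only(a, keep), only(b, keep)) for keep in views]
    return sum(scores) / len(scores)  # equal weights: an arbitrary choice
```

Even in this toy form, the choice of views and weights drives the result, which is the part that seems hard to get right in general.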


In reply to Re: compare strings intuitively by graff
in thread compare strings intuitively by rsiedl
