in reply to Fast string similarity method

Another similar module to benchmark against would be String::Approx (I haven't done any benchmarking myself, so I can't recommend one over the other).

Re^2: Fast string similarity method
by Limbic~Region (Chancellor) on May 29, 2007 at 16:55 UTC
    almut,
    I didn't mention String::Approx because of the disclaimer by the author in the POD:

    NOTE: String::Approx suits the task of string matching, not string comparison, and it works for strings, not for text.

    If you want to compare strings for similarity, you probably just want the Levenshtein edit distance (explained below),...

    String::Approx uses the Levenshtein edit distance (tLed) as its measure, but String::Approx is not well-suited for comparing the tLeds of strings, in other words, if you want a "fuzzy eq", see above. String::Approx is more like regular expressions or index(), it finds substrings that are close matches.
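
    To make the matching-versus-comparison distinction concrete, here is a rough pure-Perl sketch. It is not how String::Approx is implemented and it uses none of its API; edit_distance and best_substring_distance are hypothetical helper names I made up for illustration. The first scores two whole strings against each other (the "fuzzy eq" case), the second asks how closely a pattern matches anywhere inside a longer string (the index()-like case). The only difference is that the substring version lets a match start and end at any position for free.

    ```perl
    use strict;
    use warnings;
    use List::Util qw(min);

    # Hypothetical helper: Levenshtein edit distance between two WHOLE strings.
    sub edit_distance {
        my ($s, $t) = @_;
        my @prev = (0 .. length($t));    # cost of building each prefix of $t from ''
        for my $i (1 .. length($s)) {
            my @cur = ($i);              # cost of deleting the first $i chars of $s
            for my $j (1 .. length($t)) {
                my $cost = substr($s, $i - 1, 1) eq substr($t, $j - 1, 1) ? 0 : 1;
                push @cur, min($prev[$j] + 1,            # deletion
                               $cur[$j - 1] + 1,         # insertion
                               $prev[$j - 1] + $cost);   # match or substitution
            }
            @prev = @cur;
        }
        return $prev[-1];
    }

    # Hypothetical helper: smallest edit distance between $pattern and ANY
    # substring of $text. Same recurrence, but the first row is all zeros
    # (a match may start anywhere) and we take the minimum of the last row
    # (a match may end anywhere).
    sub best_substring_distance {
        my ($pattern, $text) = @_;
        my @prev = (0) x (length($text) + 1);
        for my $i (1 .. length($pattern)) {
            my @cur = ($i);
            for my $j (1 .. length($text)) {
                my $cost = substr($pattern, $i - 1, 1) eq substr($text, $j - 1, 1) ? 0 : 1;
                push @cur, min($prev[$j] + 1,
                               $cur[$j - 1] + 1,
                               $prev[$j - 1] + $cost);
            }
            @prev = @cur;
        }
        return min(@prev);
    }

    my $text = 'fast string simular method';
    printf "whole-string distance:   %d\n", edit_distance('similar', $text);
    printf "best substring distance: %d\n", best_substring_distance('similar', $text);
    ```

    The whole-string score is dominated by the difference in length, while the substring score stays small because only the closest window counts. That is why a tool built for the second question is a poor fit for ranking strings by overall similarity.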

    Of course, more information about the task may reveal that this is the perfect tool for the job and that the other modules are less appropriate, so thanks for bringing it up.

    Cheers - L~R