in reply to Tips on how to perform this regex query

I'd check CPAN first for an approximate string matching module. Searching for 'edit distance' turns up several candidates that look promising: Text::LevenshteinXS, Text::WagnerFischer, Text::Brew.

...roboticus

When your only tool is a hammer, all problems look like your thumb.

Re^2: Tips on how to perform this regex query
by BrowserUk (Patriarch) on Jan 11, 2014 at 13:12 UTC

    Sorry roboticus, but what do you think running one of those 'edit distance' algorithms on the two strings will actually tell the OP?

    To save time, I'll tell you. Ta-dah:

    1073!

    Which means what?

    My best guess: it is a fuzzy measure of the difference in length between the two strings (which is actually 1069).

    Which means that a simple: print length( $bigstring ) - length( $smallstring ); would be more accurate and about a million times faster.

    It doesn't tell him how different they are, nor where the best match occurs, nor even whether the shorter string actually appears in the larger one in any meaningful sense.
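
    To make that concrete, here is a rough sketch in pure Perl, so it runs without any of the CPAN modules named above. The strings are made up for illustration (they are not the OP's data), and the sub names levenshtein and best_fuzzy_match are hypothetical. The first sub is the ordinary whole-string edit distance, which can never be smaller than the difference in length. The second is a common variant of the same table with the first row set to zero, so a match may start anywhere in the longer string (usually attributed to Sellers); it is shown only as one possible way to get a distance and an end position, not as what any of those modules do.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Whole-string edit distance (Wagner-Fischer dynamic programming).
sub levenshtein {
    my ($s, $t) = @_;
    my @prev = (0 .. length $t);    # distance from the empty prefix of $s
    for my $i (1 .. length $s) {
        my @cur = ($i);
        for my $j (1 .. length $t) {
            my $cost = substr($s, $i - 1, 1) eq substr($t, $j - 1, 1) ? 0 : 1;
            push @cur, min($prev[$j] + 1,           # deletion
                           $cur[$j - 1] + 1,        # insertion
                           $prev[$j - 1] + $cost);  # match or substitution
        }
        @prev = @cur;
    }
    return $prev[-1];
}

# Same table, but the first row is all zeros so the match may begin at any
# offset in $haystack. Returns (smallest distance, offset where that match ends).
sub best_fuzzy_match {
    my ($needle, $haystack) = @_;
    my @prev = (0) x (length($haystack) + 1);
    for my $i (1 .. length $needle) {
        my @cur = ($i);
        for my $j (1 .. length $haystack) {
            my $cost = substr($needle, $i - 1, 1) eq substr($haystack, $j - 1, 1) ? 0 : 1;
            push @cur, min($prev[$j] + 1, $cur[$j - 1] + 1, $prev[$j - 1] + $cost);
        }
        @prev = @cur;
    }
    my ($best, $end) = ($prev[0], 0);
    for my $j (1 .. $#prev) {
        ($best, $end) = ($prev[$j], $j) if $prev[$j] < $best;
    }
    return ($best, $end);
}

# Made-up data: a slightly misspelled copy of the short string buried in filler.
my $small = 'needle';
my $big   = ('x' x 20) . 'nedle' . ('y' x 20);

printf "length difference:    %d\n", length($big) - length($small);
printf "whole-string distance: %d\n", levenshtein($small, $big);
my ($dist, $end) = best_fuzzy_match($small, $big);
printf "best substring match:  distance %d, ending at offset %d\n", $dist, $end;
```

    The whole-string figure is dominated by the filler, while the second figure says how close the best embedded match is and where it ends.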


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.