I like the functional programming version of roboticus, because it is easier not to be bitten by off-by-one errors.

This is my slightly modified version of it:

use strict; use warnings; sub min { my $rv = shift; for my $tmp (@_) { $rv = $tmp if $tmp < $rv; } return $rv; } sub editDistance { my ($left, $lenLeft, $right, $lenRight) = map { $_, length($_) } @ +_; return $lenRight unless $lenLeft; return $lenLeft unless $lenRight; my $shortLeft = substr $left, 0, -1; my $shortRight = substr $right, 0, -1; return editDistance ($shortLeft, $shortRight) if substr($left, -1) + eq substr($right, -1); return 1 + min( editDistance($left, $shortRight), #insert editDistance($shortLeft, $right), #remove editDistance($shortLeft, $shortRight) #replace ); }
The functional programming version has another distinct advantage.

The algorithm is very very slow when the input strings grow in size (and even more so if the strings are very different).

An example with string "LMIjkHFSAE" and "dmqkdjfERZG".

$ time perl editDistance.pl 11 <LMIjkHFSAE:dmqkdjfERZG> real 2m29.329s user 1m2.578s sys 1m26.452s
Two and a half minutes for words less than a dozen letters is really bad.

But with such a functional programming version (no global variable), you can easily use the Memoize module and get an (almost) incredible performance enhancement with the same input strings:

$ time perl editDistance.pl 11 <LMIjkHFSAE:dmqkdjfERZG> real 0m0.075s user 0m0.015s sys 0m0.030s
Below is the memoized version of the program (two line changes):
use strict; use warnings; use Memoize; sub min { my $rv = shift; for my $tmp (@_) { $rv = $tmp if $tmp < $rv; } return $rv; } sub editDistance { my ($left, $lenLeft, $right, $lenRight) = map { $_, length($_) } @ +_; return $lenRight unless $lenLeft; return $lenLeft unless $lenRight; my $shortLeft = substr $left, 0, -1; my $shortRight = substr $right, 0, -1; return editDistance ($shortLeft, $shortRight) if substr($left, -1) + eq substr($right, -1); return 1 + min( editDistance($left, $shortRight), #insert editDistance($shortLeft, $right), #remove editDistance($shortLeft, $shortRight) #replace ); } memoize("editDistance"); editDistance(...);
Update: This memoized version is almost 2,000 faster than the original one (for the same input strings)!

In reply to Re: Edit Distance Implementation in Perl by Laurent_R
in thread Edit Distance Implementation in Perl by kris1511

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.