in reply to Edit Distance Implementation in Perl

I like the functional programming version of roboticus, because it is easier not to be bitten by off-by-one errors.

This is my slightly modified version of it:

use strict; use warnings; sub min { my $rv = shift; for my $tmp (@_) { $rv = $tmp if $tmp < $rv; } return $rv; } sub editDistance { my ($left, $lenLeft, $right, $lenRight) = map { $_, length($_) } @ +_; return $lenRight unless $lenLeft; return $lenLeft unless $lenRight; my $shortLeft = substr $left, 0, -1; my $shortRight = substr $right, 0, -1; return editDistance ($shortLeft, $shortRight) if substr($left, -1) + eq substr($right, -1); return 1 + min( editDistance($left, $shortRight), #insert editDistance($shortLeft, $right), #remove editDistance($shortLeft, $shortRight) #replace ); }
The functional programming version has another distinct advantage.

The algorithm is very very slow when the input strings grow in size (and even more so if the strings are very different).

An example with string "LMIjkHFSAE" and "dmqkdjfERZG".

$ time perl editDistance.pl 11 <LMIjkHFSAE:dmqkdjfERZG> real 2m29.329s user 1m2.578s sys 1m26.452s
Two and a half minutes for words less than a dozen letters is really bad.

But with such a functional programming version (no global variable), you can easily use the Memoize module and get an (almost) incredible performance enhancement with the same input strings:

$ time perl editDistance.pl 11 <LMIjkHFSAE:dmqkdjfERZG> real 0m0.075s user 0m0.015s sys 0m0.030s
Below is the memoized version of the program (two line changes):
use strict; use warnings; use Memoize; sub min { my $rv = shift; for my $tmp (@_) { $rv = $tmp if $tmp < $rv; } return $rv; } sub editDistance { my ($left, $lenLeft, $right, $lenRight) = map { $_, length($_) } @ +_; return $lenRight unless $lenLeft; return $lenLeft unless $lenRight; my $shortLeft = substr $left, 0, -1; my $shortRight = substr $right, 0, -1; return editDistance ($shortLeft, $shortRight) if substr($left, -1) + eq substr($right, -1); return 1 + min( editDistance($left, $shortRight), #insert editDistance($shortLeft, $right), #remove editDistance($shortLeft, $shortRight) #replace ); } memoize("editDistance"); editDistance(...);
Update: This memoized version is almost 2,000 faster than the original one (for the same input strings)!