kris1511 has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks! I am trying to implement Edit Distance Interview problem in Perl, but seems like there is some issue. Below is my code
sub editDistance{ my($str1, $str2, $len1, $len2) = @_; if ($len1 ==0){ return $len2; } if ($len2 ==0){ return $len1; } if(substr($str1, -1) eq substr($str2,-1)){ return editDistance($str1,$str2,$len1-1,$len2-1); } my @distance = (); push @distance, editDistance($str1, $str2, $len1, $len2-1); #inser +t push @distance, editDistance($str1, $str2, $len1-1, $len2); #remov +e push @distance, editDistance($str1, $str2, $len1-1, $len2-1); #rep +lace my $min = min @distance; return 1 + $min; } my $str1 ="Sundayyyy"; my $str2 = "Saturday"; print editDistance($str1, $str2, length($str1), length($str2))
Where did I go wrong ? http://www.geeksforgeeks.org/dynamic-programming-set-5-edit-distance/

Replies are listed 'Best First'.
Re: Edit Distance Implementation in Perl
by roboticus (Chancellor) on Aug 24, 2017 at 04:29 UTC

    kris1511:

    The main reason you're getting the wrong answer is that you're not consistently handling the strings and lengths. Specifically: you're trying to pass the string lengths with the strings so you don't have to modify them, but when you're comparing the specific string endings, you're not taking the string lengths into account. It's probably simpler to just go ahead and modify the strings and pass them around:

    use strict; use warnings; sub shorten { my $tmp = shift; return substr($tmp,0,length($tmp)-1); } sub min { my $rv = shift; while (@_) { my $tmp = shift; $rv = $tmp if $tmp < $rv; } return $rv; } sub editDistance { my ($l, $lenL, $r, $lenR) = map { $_, length($_) } @_; return $lenR unless $lenL; return $lenL unless $lenR; if (substr($l,-1) eq substr($r,-1)) { return editDistance(shorten($l), shorten($r)); } return 1 + min( editDistance($l, shorten($r)), #insert editDistance(shorten($l), $r), #remove editDistance(shorten($l), shorten($r)) #replace ); } my @pairs = ( [ "Sundayyy", "Saturday" ], [ "sunday", "saturday" ], ); for my $ar (@pairs) { my ($str1, $str2) = @$ar; print editDistance($str1,$str2), " <$str1:$str2>\n"; }

    Update: I just noticed in the docs that substr can take a negative length to remove characters from the end. So if you use substr($x,0,-1) everywhere I used shorten you can drop the shorten function altogether, making the solution a little nicer.

    Update 2: Fixed the min routine, as Laurent R observes that my original version would fail miserably if there is a 0 in the list (causing it to terminate prematurely). The example below (with the old min) prints 5. Urk!

    sub min { my $rv = shift; while (my $tmp = shift) { $rv = $tmp if $tmp < $rv; } return $rv; } print "Min test: ", min(5,0,-10), "\n";

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

Re: Edit Distance Implementation in Perl
by thanos1983 (Parson) on Aug 24, 2017 at 09:29 UTC

    Hello kris1511,

    I tried several times your sample of code and the link that you provided us until I understand what you where trying to achieve. This is not what it is suppose to happen, this is a free forum that people are trying to assist each other for free. You need to work with your code make sure you provided us all the information that we need to replicate your problem so we can assist you as fast as possible and also to see that you have tried everything possible withing your knowledge before asking a question here. Of course this applies to all of us, not just you.

    Having said that, the solution to your problem is provided bellow:

    #!/usr/bin/perl use strict; use warnings; use feature 'say'; use List::Util qw( min ); sub editDistance{ my ($str1, $str2, $m, $n) = @_; return $n if $m == 0; return $m if $n == 0; return editDistance($str1, $str2, $m-1, $n-1) if substr($str1, $m, 1) eq substr($str2, $n, 1); return 1 + min ( editDistance($str1, $str2, $m, $n-1), #insert editDistance($str1, $str2, $m-1, $n), #remove editDistance($str1, $str2, $m-1, $n-1) ); #replace } my $str1 = "sunday"; my $str2 = "saturday"; say editDistance($str1, $str2, length($str1), length($str2)); __END__ $ perl test.pl 3

    The reason that you where not getting the expected result is because you need to understand what happens with substr. Read again and again and see my update.

    Update: From the nice implementation of fellow monk roboticus, just for testing purposes:

    #!/usr/bin/perl use strict; use warnings; use feature 'say'; use List::Util qw( min ); sub editDistance{ my ($str1, $str2, $m, $n) = @_; return $n if $m == 0; return $m if $n == 0; return editDistance($str1, $str2, $m-1, $n-1) if substr($str1, $m, 1) eq substr($str2, $n, 1); return 1 + min ( editDistance($str1, $str2, $m, $n-1), #insert editDistance($str1, $str2, $m-1, $n), #remove editDistance($str1, $str2, $m-1, $n-1) ); #replace } my @pairs = ( [ "Sundayyy", "Saturday" ], [ "sunday", "saturday" ], ); for my $ar (@pairs) { my ($str1, $str2) = @$ar; say editDistance($str1, $str2, length($str1), length($str2)) . " < +$str1:$str2>"; } __END__ $ perl test.pl 5 <Sundayyy:Saturday> 3 <sunday:saturday>

    Hope this helps, BR.

    Seeking for Perl wisdom...on the process of learning...not there...yet!

      thanos1983:

      Nice! I stand corrected--I thought it would be simpler to omit the length passing and directly whack the strings. But I like yours better.

      I did have a brief go at making one that did the length passing, but was bitten by my nemesis--fencepost errors! ;^)

      ...roboticus

      When your only tool is a hammer, all problems look like your thumb.

        Hello roboticus,

        Thanks for the reply, I appreciate your time and effort.

        BR / Thanos,

        Seeking for Perl wisdom...on the process of learning...not there...yet!
Re: Edit Distance Implementation in Perl
by Laurent_R (Canon) on Aug 24, 2017 at 11:12 UTC
    I like the functional programming version of roboticus, because it is easier not to be bitten by off-by-one errors.

    This is my slightly modified version of it:

    use strict; use warnings; sub min { my $rv = shift; for my $tmp (@_) { $rv = $tmp if $tmp < $rv; } return $rv; } sub editDistance { my ($left, $lenLeft, $right, $lenRight) = map { $_, length($_) } @ +_; return $lenRight unless $lenLeft; return $lenLeft unless $lenRight; my $shortLeft = substr $left, 0, -1; my $shortRight = substr $right, 0, -1; return editDistance ($shortLeft, $shortRight) if substr($left, -1) + eq substr($right, -1); return 1 + min( editDistance($left, $shortRight), #insert editDistance($shortLeft, $right), #remove editDistance($shortLeft, $shortRight) #replace ); }
    The functional programming version has another distinct advantage.

    The algorithm is very very slow when the input strings grow in size (and even more so if the strings are very different).

    An example with string "LMIjkHFSAE" and "dmqkdjfERZG".

    $ time perl editDistance.pl 11 <LMIjkHFSAE:dmqkdjfERZG> real 2m29.329s user 1m2.578s sys 1m26.452s
    Two and a half minutes for words less than a dozen letters is really bad.

    But with such a functional programming version (no global variable), you can easily use the Memoize module and get an (almost) incredible performance enhancement with the same input strings:

    $ time perl editDistance.pl 11 <LMIjkHFSAE:dmqkdjfERZG> real 0m0.075s user 0m0.015s sys 0m0.030s
    Below is the memoized version of the program (two line changes):
Re: Edit Distance Implementation in Perl
by tybalt89 (Monsignor) on Aug 24, 2017 at 12:48 UTC

    For regex fans, with explicit caching (memoize)

    #!/usr/bin/perl -l # http://perlmonks.org/?node_id=1197896 use strict; use warnings; use List::Util 'min'; my %cache; sub editDistance { local $_ = shift; $cache{$_} //= /(.+)(\0.*)\1$/ ? editDistance("$`$2") : /(.)(\0.*)(.)$/ ? 1 + min map editDistance($_), "$`$2$3", "$`$1$2", "$`$2" : length($_) - 1; } my $str1 ="Sundayyyy"; my $str2 = "Saturday"; print editDistance("$str1\0$str2");
Re: Edit Distance Implementation in Perl
by LanX (Saint) on Aug 23, 2017 at 23:35 UTC
    Sorry "doesn't work" doesn't work.

    Please provide a more detailed problem description.

    See How do I post a question effectively?

    especially

    • Explain in detail what you get (include sample output, error messages and warnings, for example).
    • Tell us how your output means your script isn't working

    Cheers Rolf
    (addicted to the Perl Programming Language and ☆☆☆☆ :)
    Je suis Charlie!