in reply to Levenstein distance transcription
It's a fairly popular algorithm: you can find Perl implementations on both Rosetta Code and Wikibooks. For your convenience, I'm posting a copy of the Wikibooks version below:
    use List::Util qw(min);

    sub levenshtein {
        my ($str1, $str2) = @_;
        my @ar1 = split //, $str1;
        my @ar2 = split //, $str2;

        my @dist;
        $dist[$_][0] = $_ foreach (0 .. @ar1);
        $dist[0][$_] = $_ foreach (0 .. @ar2);

        foreach my $i (1 .. @ar1) {
            foreach my $j (1 .. @ar2) {
                my $cost = $ar1[$i - 1] eq $ar2[$j - 1] ? 0 : 1;
                $dist[$i][$j] = min(
                    $dist[$i - 1][$j] + 1,
                    $dist[$i][$j - 1] + 1,
                    $dist[$i - 1][$j - 1] + $cost
                );
            }
        }

        return $dist[@ar1][@ar2];
    }
Update: I'm sorry, I missed the "based on words" part of your question. Thankfully, Eily already answered your question, reading it more carefully than I did.
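For what it's worth, the same dynamic-programming table should work on words if you compare list elements that are whole words rather than single characters. Here is a minimal sketch under that assumption; the name word_levenshtein and the choice to split on whitespace are mine, not taken from the question, so adjust the tokenisation to whatever "word" means for your data:

```perl
use strict;
use warnings;
use List::Util qw(min);

# Hypothetical word-level variant: each token is a whitespace-separated
# word, and one insertion, deletion, or substitution of a word costs 1.
sub word_levenshtein {
    my ($str1, $str2) = @_;

    # split ' ' splits on runs of whitespace and skips leading whitespace
    my @w1 = split ' ', $str1;
    my @w2 = split ' ', $str2;

    # First column and first row: distance to or from the empty word list
    my @dist;
    $dist[$_][0] = $_ for 0 .. @w1;
    $dist[0][$_] = $_ for 0 .. @w2;

    for my $i (1 .. @w1) {
        for my $j (1 .. @w2) {
            my $cost = $w1[$i - 1] eq $w2[$j - 1] ? 0 : 1;
            $dist[$i][$j] = min(
                $dist[$i - 1][$j] + 1,            # delete a word
                $dist[$i][$j - 1] + 1,            # insert a word
                $dist[$i - 1][$j - 1] + $cost,    # keep or substitute
            );
        }
    }

    return $dist[@w1][@w2];
}

# Two words differ ("quick"/"slow", "fox"/"dog"), so this prints 2
print word_levenshtein('the quick brown fox', 'the slow brown dog'), "\n";
```

Note that the comparison with eq is case-sensitive and treats punctuation as part of a word, so you may want to lowercase or strip punctuation first.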
- Luke
Re^2: Levenstein distance transcription
by Eily (Monsignor) on Dec 05, 2014 at 13:48 UTC

Re^2: Levenstein distance transcription
by wollmers (Scribe) on Dec 05, 2014 at 22:41 UTC