Re: Levenshtein distance transcription

It doesn't look like there is. If you have few unique words (taking smaller portions of a bigger string if necessary), you could always replace each word with a single character and run the character version of the algorithm on the result:

use v5.14;
use Data::Dump qw/pp/;

my @chars = ('0'..'9', 'a'..'z', 'A'..'Z');
# Up to scalar(@chars) different words (actually @chars+1 because of undef, but that wouldn't help readability)

$_ = <<STR;
Jack and Jill went up the hill to fetch a pail of water
Jack fell down and broke his crown and Jill came tumbling after
STR

my @words = /\w+/g;
my %replace;
my $asChars = join '', map { $replace{$_}//=shift(@chars) } @words; # 'defined-orcish manoeuver' :D
# say pp \%replace;
say $asChars;
my %reverse = reverse %replace;
say join ' ', map $reverse{$_}, split //, $asChars;


__DATA__
0123456789abc0de1fgh12ijk
Jack and Jill went up the hill to fetch a pail of water Jack fell down and broke his crown and Jill came tumbling after

Edit: for the comparison to work, you have to use the same %replace hash for all strings. The $h{$_} //= NewVal() idiom (a defined-or variant of the Orcish Maneuver) means that any word already known is replaced by its existing substitute, while an unknown word adds a new entry to the hash. Here I use // instead of || because otherwise '0' would be treated as false, and the word mapped to it would be given a new character every time it came up.
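
To make the shared-hash point concrete, here is a minimal sketch that encodes two strings against one %replace and compares the results. The names encode and char_distance are made up for illustration, and char_distance is a hand-rolled character Levenshtein standing in for whatever character-level implementation you already use, so treat it as an assumption rather than the recommended way:

```perl
use v5.14;
use warnings;
use List::Util qw/min/;

my @chars = ('0'..'9', 'a'..'z', 'A'..'Z');
my %replace;    # shared by every string that gets encoded

# Hypothetical helper: one substitute char per word, reusing %replace.
sub encode {
    my ($text) = @_;
    return join '', map {
        $replace{$_} //= shift(@chars) // die "ran out of substitute chars\n"
    } $text =~ /\w+/g;
}

# Hypothetical stand-in for the character version of the algorithm
# (plain two-row dynamic programming).
sub char_distance {
    my ($s, $t) = @_;
    my @prev = (0 .. length $t);
    for my $i (1 .. length $s) {
        my @cur = ($i);
        for my $j (1 .. length $t) {
            my $cost = substr($s, $i - 1, 1) eq substr($t, $j - 1, 1) ? 0 : 1;
            push @cur, min($prev[$j] + 1, $cur[$j - 1] + 1, $prev[$j - 1] + $cost);
        }
        @prev = @cur;
    }
    return $prev[-1];
}

my $x = encode('Jack and Jill went up the hill');
my $y = encode('Jack and Jane went down the hill');
say "$x vs $y";
say char_distance($x, $y);    # distance counted in words, not letters
```

Because both calls to encode draw on the same %replace, 'Jack', 'and', 'went', 'the' and 'hill' get the same character in both strings, so only the two changed words count towards the distance.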
