What you really need is to align the two texts with a "dynamic programming" algorithm. This is a common task in bioinformatics, but there the atomic unit is a single character drawn from a small alphabet (usually 4 letters for nucleotides or 20 for amino acids). You would probably have to hack it a fair bit to work with an array of words from an essentially unlimited "character set", though I haven't looked at the code in detail:
Bio::Tools::dpAlign
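
To give an idea of what word-level alignment looks like, here is a minimal sketch that uses a plain longest-common-subsequence table over arrays of words. It does not use Bio::Tools::dpAlign at all and has no gap or mismatch scoring, so treat it as an illustration of the technique rather than a replacement for a real aligner. The name align_words and the '=', '-', '+' markers are my own invention.

```perl
use strict;
use warnings;

# Align two word lists with a longest-common-subsequence (LCS) table.
# Returns a list of [op, word] pairs:
#   '=' word is in both, '-' word only in the first, '+' word only in the second.
sub align_words {
    my ($x, $y) = @_;
    my ($n, $m) = (scalar @$x, scalar @$y);

    # $len[$i][$j] holds the LCS length of $x->[$i .. end] and $y->[$j .. end].
    my @len = map { [ (0) x ($m + 1) ] } 0 .. $n;
    for my $i (reverse 0 .. $n - 1) {
        for my $j (reverse 0 .. $m - 1) {
            if ($x->[$i] eq $y->[$j]) {
                $len[$i][$j] = $len[$i + 1][$j + 1] + 1;
            }
            else {
                my ($down, $right) = ($len[$i + 1][$j], $len[$i][$j + 1]);
                $len[$i][$j] = $down >= $right ? $down : $right;
            }
        }
    }

    # Walk the table from the start of both lists to recover the alignment.
    my ($i, $j, @out) = (0, 0);
    while ($i < $n && $j < $m) {
        if ($x->[$i] eq $y->[$j]) {
            push @out, [ '=', $x->[$i] ];
            $i++;
            $j++;
        }
        elsif ($len[$i + 1][$j] >= $len[$i][$j + 1]) {
            push @out, [ '-', $x->[$i++] ];
        }
        else {
            push @out, [ '+', $y->[$j++] ];
        }
    }
    push @out, [ '-', $x->[$i++] ] while $i < $n;
    push @out, [ '+', $y->[$j++] ] while $j < $m;
    return @out;
}

my @first  = split ' ', 'the quick brown fox jumps';
my @second = split ' ', 'the brown fox quietly jumps';
print "$_->[0] $_->[1]\n" for align_words(\@first, \@second);
```

The table needs memory proportional to the product of the two word counts, so this is only reasonable for texts of modest length.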
For a quick and dirty solution, I would extend the hash comparison approach to handle words, word pairs, triplets and maybe longer runs.
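
A rough sketch of that idea, assuming "hash comparison" means storing each item as a hash key and counting the keys the two texts share. The function names word_ngrams and ngram_overlap and the overlap score (shared keys divided by all distinct keys) are my own choices for illustration, not something taken from an existing module.

```perl
use strict;
use warnings;

# Collect every run of 1 .. $max consecutive words as a hash key.
sub word_ngrams {
    my ($text, $max) = @_;
    my @words = split ' ', lc $text;
    my %seen;
    for my $n (1 .. $max) {
        # The range is empty when the text has fewer than $n words.
        for my $start (0 .. @words - $n) {
            $seen{ join ' ', @words[ $start .. $start + $n - 1 ] }++;
        }
    }
    return \%seen;
}

# Share of distinct n-grams (from either text) that occur in both texts.
sub ngram_overlap {
    my ($text_a, $text_b, $max) = @_;
    my $a_grams = word_ngrams($text_a, $max);
    my $b_grams = word_ngrams($text_b, $max);
    my $shared  = grep { exists $b_grams->{$_} } keys %$a_grams;
    my %union   = (%$a_grams, %$b_grams);
    my $total   = keys %union;
    return $total ? $shared / $total : 0;
}

printf "%.2f\n", ngram_overlap('the quick brown fox', 'the quick red fox', 3);
```

Longer runs reward matching word order, so raising $max makes the score stricter about rearranged text.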
Also, keep searching CPAN; there may be something else out there.