in reply to Finding a LCS module on word level

Algorithm::Diff seems to do what you are looking for.

use strict;
use warnings;
use Algorithm::Diff qw/LCSidx/;

my $str1 = "I am trying to find a perl LCS module in perl monk";
my $str2 = "perl monk";

print longest( $str1, $str2 );

sub longest {
    my @seq1 = split /\s+/, $_[0];
    my @seq2 = split /\s+/, $_[1];
    my ( $idx1, $idx2 ) = LCSidx( \@seq1, \@seq2 );
    my @list = @seq1[@$idx1];
    return join( ' ', @list ), '[', join( ',', @$idx1 ), ']';
}

Yields:

perl monk[10,11]

Re^2: Finding a LCS module on word level
by almut (Canon) on May 09, 2008 at 15:16 UTC

    Algorithm::Diff finds the longest common subsequence, which is not necessarily the same as the longest common substring: a subsequence may skip elements, while a substring may not. So I don't think it's what the OP is looking for. The result obtained with the given sample strings does look correct, but only because in this particular case the longest common subsequence is also the longest common substring. This is not always true. For example, try changing $str2 to "a perl monk". Algorithm::Diff::LCSidx will (correctly) identify this longest common subsequence:

    a perl monk[5,10,11]

    which is not a substring of $str1, since the words at indices 5 and 10 are not adjacent.
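
For comparison, here is a minimal sketch of a word-level longest common substring, using the standard dynamic-programming approach rather than Algorithm::Diff. The names longest_common_run and longest_substring are made up for this illustration and are not part of any module. It assumes words are compared with eq and that ties go to the earliest run in the first string.

```perl
use strict;
use warnings;

# Longest common *contiguous* run of words between two word lists.
# Returns (start index in the first list, length); length is 0 if
# the lists share no word.
sub longest_common_run {
    my ( $seq1, $seq2 ) = @_;
    my ( $best_len, $best_end ) = ( 0, -1 );

    # $prev[$j + 1] holds the length of the common run ending at
    # the previous word of $seq1 and word $j of $seq2.
    my @prev = (0) x ( @$seq2 + 1 );

    for my $i ( 0 .. $#$seq1 ) {
        my @curr = (0) x ( @$seq2 + 1 );
        for my $j ( 0 .. $#$seq2 ) {
            next unless $seq1->[$i] eq $seq2->[$j];

            # Extend the run that ended at ($i - 1, $j - 1).
            $curr[ $j + 1 ] = $prev[$j] + 1;

            # Strict '>' keeps the earliest run on ties.
            if ( $curr[ $j + 1 ] > $best_len ) {
                $best_len = $curr[ $j + 1 ];
                $best_end = $i;
            }
        }
        @prev = @curr;
    }
    return ( $best_end - $best_len + 1, $best_len );
}

# Same output shape as longest() above: the words, then their
# indices in the first string. Returns '' if nothing is shared.
sub longest_substring {
    my @seq1 = split ' ', $_[0];
    my @seq2 = split ' ', $_[1];
    my ( $start, $len ) = longest_common_run( \@seq1, \@seq2 );
    return '' unless $len;
    my @idx = $start .. $start + $len - 1;
    return join( ' ', @seq1[@idx] ) . '[' . join( ',', @idx ) . ']';
}

print longest_substring(
    "I am trying to find a perl LCS module in perl monk",
    "a perl monk"
), "\n";
```

With $str2 set to "a perl monk" this prints a perl[5,6]: "a perl" and "perl monk" are both two words long, and the sketch keeps the first one it finds. With the original $str2 ("perl monk") it gives perl monk[10,11], the same as the Algorithm::Diff version.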