I haven't used it much myself; looking at the man page, it seems like you might need to split each string into an array of characters -- something like this (not tested):
I'm just guessing about that. Do heed Grandfather's request, and show us some data (and maybe some other code you've tried that really is more relevant).#!/usr/bin/perl use strict; use Algorithm::Diff qw/LCS/; my @strings; # fill @strings with your 300 elements of 3k characters each, then ... for my $i ( 0 .. $#strings-1 ) { my @seq1 = split //, $strings[$i]; for my $j ( $i+1 .. $#strings ) { my @seq2 = split //, $strings[$j]; my @lcs = LCS( \@seq1, \@seq2 ); print "LCS for $i :: $j is ", join( "",@lcs,"\n" ); } }
(updated code to fix a variable-name typo and the comment, and to declare @strings; also added spacing in the print statement, to avoid misuse of "::" as a namespace designation. Tested it on a sample text file (380 lines but less than 100 chars/line), and it seemed to do what's desired** -- not tremendously fast, but not impossibly slow.)
(** second update: ** then again, looking at the output, it's not clear that my script is actually making correct use of the module -- I don't seem to be getting a single contiguous "LCS" string from each comparison as I was expecting, but rather a bunch of fragments that are each one or more characters long. Better read further in the man page...)
In reply to Re: Search for identical substrings
by graff
in thread Search for identical substrings
by bioMan
| For: | Use: | ||
| & | & | ||
| < | < | ||
| > | > | ||
| [ | [ | ||
| ] | ] |