It's a question of whether overlapping matches are wanted or not. The code I posted in Re: FInding the longest match from an initial match between two files
deliberately did not look for overlapping matches.
If overlapping matches are wanted, the regex could be changed to the following:
    #!/usr/bin/perl -l
    use strict;
    use warnings;

    my $k = 5;
    my $file1contents = 'TACATCTCAAAACACTTTCATCTCACGACTACTACTACTACTTCAAAACACCATCAT';
    my $file2contents = 'ACTTCAACATAACTACTATATACTACTCATACTACTACTCTTAAAACTACTATACTA';

    $_ = "$file1contents\n$file2contents";
    print "at position $-[0] is match $1" while /(?= (.{$k,}) .* \n .* \1 )/gx;
And the output from this change is:
    at position 8 is match AAAAC
    at position 27 is match ACTACTACT
    at position 28 is match CTACTACT
    at position 29 is match TACTACTACT
    at position 30 is match ACTACTACT
    at position 31 is match CTACTACT
    at position 32 is match TACTACTACT
    at position 33 is match ACTACTACT
    at position 34 is match CTACTACT
    at position 35 is match TACTACT
    at position 36 is match ACTACT
    at position 37 is match CTACT
    at position 39 is match ACTTCAA
    at position 40 is match CTTCAA
    at position 41 is match TTCAA
    at position 44 is match AAAAC
which shows the longer match you found (in fact, two of them, partially overlapping).
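The reason the lookahead version reports overlapping matches is that `(?= ... )` is zero-width: nothing is consumed, so the next attempt starts only one character further along. A minimal sketch of the difference on a toy string (the string `ABABA` and the variable names are made up for illustration, not taken from the thread):

```perl
use strict;
use warnings;

my $text = 'ABABA';

# Consuming match: the matched characters are used up, so the search
# resumes after them and the second, overlapping "ABA" is never seen.
my @consumed = $text =~ /(ABA)/g;

# Lookahead match: the assertion is zero-width, so the engine moves on
# one position at a time and can capture runs that share characters.
my @overlapping = $text =~ /(?=(ABA))/g;

print scalar(@consumed), " consuming, ", scalar(@overlapping), " overlapping\n";
```

This should print `1 consuming, 2 overlapping`.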
It all depends on what the output is going to be used for, I suppose. One of the reasons
I posted the code was to prompt discussion about the problem.
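If the output is only wanted for the single longest match, one possible follow-up is to keep the best capture while looping instead of printing every one. This is a rough sketch, not a tuned solution: the sub name `longest_overlapping_match` is hypothetical, it assumes neither input contains a newline, and on ties it keeps the first match found.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (position, substring) for the longest run of
# at least $min_len characters from $first that also occurs in $second.
# Assumes neither string contains a newline.
sub longest_overlapping_match {
    my ($first, $second, $min_len) = @_;
    my $joined = "$first\n$second";
    my ($best_pos, $best) = (-1, '');

    # Same lookahead regex as above; it is zero-width, so every start
    # position in the first line gets a chance to match.
    while ($joined =~ /(?= (.{$min_len,}) .* \n .* \1 )/gx) {
        # Strictly longer only, so the earliest of equal-length matches wins.
        ($best_pos, $best) = ($-[0], $1) if length($1) > length($best);
    }
    return ($best_pos, $best);
}

my ($pos, $match) = longest_overlapping_match(
    'TACATCTCAAAACACTTTCATCTCACGACTACTACTACTACTTCAAAACACCATCAT',
    'ACTTCAACATAACTACTATATACTACTCATACTACTACTCTTAAAACTACTATACTA',
    5,
);
print "longest match is $match at position $pos\n";
```

With the sample data from this post, that should report TACTACTACT at position 29, the first of the two longest entries in the output above.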
In reply to Re^3: FInding the longest match from an initial match between two files
by tybalt89
in thread FInding the longest match from an initial match between two files
by Allie_grater