It's a question of whether overlapping matches are wanted or not. The code I posted in Re: FInding the longest match from an initial match between two files deliberately did not look for overlapping matches.
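To illustrate the distinction, here is a minimal sketch on a toy string (the string 'aaaa' and the variable names are only for illustration, not taken from the earlier post). A pattern that consumes its match cannot report a hit that starts inside a previous hit, whereas wrapping the pattern in a zero-width lookahead lets the regex engine retry at every position:

```perl
use strict;
use warnings;

my $text = 'aaaa';    # toy input, purely illustrative

# Consuming match: each hit uses up two characters,
# so the next search resumes after the previous hit.
my @consuming;
push @consuming, $-[0] while $text =~ /aa/g;

# Zero-width lookahead: nothing is consumed, so the engine
# advances one character at a time and overlapping hits are found.
my @lookahead;
push @lookahead, $-[0] while $text =~ /(?=(aa))/g;

print "consuming start positions: @consuming\n";    # 0 2
print "lookahead start positions: @lookahead\n";    # 0 1 2
```

The same idea, a capture group inside `(?= ... )`, is what allows overlapping matches between the two file contents.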

If overlapping matches are wanted, the regex could be changed to the following:

#!/usr/bin/perl -l

use strict;
use warnings;

my $k = 5;

my $file1contents = 'TACATCTCAAAACACTTTCATCTCACGACTACTACTACTACTTCAAAACACCATCAT';
my $file2contents = 'ACTTCAACATAACTACTATATACTACTCATACTACTACTCTTAAAACTACTATACTA';

$_ = "$file1contents\n$file2contents";

print "at position $-[0] is match $1" while /(?= (.{$k,}) .* \n .* \1 )/gx;

And the output from this change is:

at position 8 is match AAAAC
at position 27 is match ACTACTACT
at position 28 is match CTACTACT
at position 29 is match TACTACTACT
at position 30 is match ACTACTACT
at position 31 is match CTACTACT
at position 32 is match TACTACTACT
at position 33 is match ACTACTACT
at position 34 is match CTACTACT
at position 35 is match TACTACT
at position 36 is match ACTACT
at position 37 is match CTACT
at position 39 is match ACTTCAA
at position 40 is match CTTCAA
at position 41 is match TTCAA
at position 44 is match AAAAC

which shows the longer match you found (in fact, two of them, partially overlapping).
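If only the longest matches are of interest, the overlapping results can be collected and filtered afterwards. The following is just a sketch under that assumption; the helper names overlapping_matches and longest_matches are made up for this example and are not part of the code above:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $k = 5;

my $file1contents = 'TACATCTCAAAACACTTTCATCTCACGACTACTACTACTACTTCAAAACACCATCAT';
my $file2contents = 'ACTTCAACATAACTACTATATACTACTCATACTACTACTCTTAAAACTACTATACTA';

# Hypothetical helper: returns [position, match] pairs for every position
# in $first where a substring of at least $min characters also occurs in
# $second. Same lookahead regex as above, so matches may overlap.
sub overlapping_matches {
    my ($first, $second, $min) = @_;
    my $joined = "$first\n$second";
    my @found;
    while ($joined =~ /(?= (.{$min,}) .* \n .* \1 )/gx) {
        push @found, [ $-[0], $1 ];
    }
    return @found;
}

# Hypothetical helper: keeps only the pairs whose match has maximal length.
sub longest_matches {
    my @found = @_;
    my $max = 0;
    for my $pair (@found) {
        my $len = length $pair->[1];
        $max = $len if $len > $max;
    }
    return grep { length($_->[1]) == $max } @found;
}

print "at position $_->[0] is match $_->[1]\n"
    for longest_matches(overlapping_matches($file1contents, $file2contents, $k));
```

With the sample strings this should report the two TACTACTACT matches at positions 29 and 32, in line with the output listed above.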

It all depends on what the output is going to be used for, I suppose. One of the reasons I posted the code was to prompt discussion about the problem.

In reply to Re^3: FInding the longest match from an initial match between two files by tybalt89
in thread FInding the longest match from an initial match between two files by Allie_grater
