It's a question of whether overlapping matches are wanted or not. The code I posted in Re: FInding the longest match from an initial match between two files deliberately did not look for overlapping matches.

If overlapping matches are wanted, the regex could be changed to use a lookahead, which consumes no characters, so the next search starts just one position further on:

#!/usr/bin/perl -l
use strict;
use warnings;

my $k = 5;    # minimum match length
my $file1contents = 'TACATCTCAAAACACTTTCATCTCACGACTACTACTACTACTTCAAAACACCATCAT';
my $file2contents = 'ACTTCAACATAACTACTATATACTACTCATACTACTACTCTTAAAACTACTATACTA';

# Join both strings with a newline so one regex can look across them.
$_ = "$file1contents\n$file2contents";

# The lookahead matches zero characters, so every start position is tried.
print "at position $-[0] is match $1" while /(?= (.{$k,}) .* \n .* \1 )/gx;

And the output from this change is:

at position 8 is match AAAAC
at position 27 is match ACTACTACT
at position 28 is match CTACTACT
at position 29 is match TACTACTACT
at position 30 is match ACTACTACT
at position 31 is match CTACTACT
at position 32 is match TACTACTACT
at position 33 is match ACTACTACT
at position 34 is match CTACTACT
at position 35 is match TACTACT
at position 36 is match ACTACT
at position 37 is match CTACT
at position 39 is match ACTTCAA
at position 40 is match CTTCAA
at position 41 is match TTCAA
at position 44 is match AAAAC

which shows the longer match you found (in fact, two of them, partially overlapping).
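
If only the longest match is of interest, the overlapping results can be filtered after the fact. The following is just a minimal sketch, assuming the same two strings and $k as above; the names $text, @matches, $max and @longest are mine and not from the earlier code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $k = 5;    # minimum match length
my $file1contents = 'TACATCTCAAAACACTTTCATCTCACGACTACTACTACTACTTCAAAACACCATCAT';
my $file2contents = 'ACTTCAACATAACTACTATATACTACTCATACTACTACTCTTAAAACTACTATACTA';

my $text = "$file1contents\n$file2contents";

# Collect every overlapping match as [ position, substring ].
my @matches;
while ( $text =~ /(?= (.{$k,}) .* \n .* \1 )/gx ) {
    push @matches, [ $-[0], $1 ];
}

# Find the greatest match length seen.
my $max = 0;
for my $m (@matches) {
    $max = length( $m->[1] ) if length( $m->[1] ) > $max;
}

# Keep every match of that length (there may be more than one).
my @longest = grep { length( $_->[1] ) == $max } @matches;

print "at position $_->[0] is longest match $_->[1]\n" for @longest;
```

With the data above, this should report only the two 10-character matches from the listing, at positions 29 and 32. Keeping all matches of maximal length, rather than just the first, is a choice on my part; whether ties matter depends on what the output is for.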

It all depends on what the output is going to be used for, I suppose. One of the reasons I posted the code was to prompt discussion about the problem.


In reply to Re^3: FInding the longest match from an initial match between two files by tybalt89
in thread FInding the longest match from an initial match between two files by Allie_grater
