in reply to Re^3: Longest common substring with N mismatches
in thread Longest common substring with N mismatches

I am sorry, but I can't tell whether that post deals with the same issue.
As far as I can see, the person asking there needs matches of N-mers, whereas I was asking whether there is a way to find just the single longest common substring that tolerates a defined number of mismatches/replacements (e.g. 1 or 2).

Re^5: Longest common substring with N mismatches
by LanX (Saint) on Sep 11, 2017 at 17:05 UTC
      Sorry, I meant to write k-mer before :)
      One example could be the following (using also the node you mentioned as input):
      $str1='AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'; $str2='RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRAAAA +AAAAAAAAAAAAAAAAAAAAAAAAAAABAAAAAAAAAAAAAAAAAAAAAAAABAAZZZZZZZ';

      In this case, the whole of $str1 should match, because it only has 1 mismatch (assuming the user sets the allowed number of mismatches to 1), and not just the run of As before the B.
        Looks like the same solution fits: XOR the two strings at each relative offset, then look for the longest run of zero bytes that tolerates up to n non-zero bytes in between, probably with a composed regex.

        See ^ in perlop#Bitwise-Or-and-Exclusive-Or
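A rough, untuned sketch of that idea, in case it helps. The sub name `longest_with_mismatches` and its return convention are made up for illustration, and it assumes plain byte/ASCII strings and the classic string-wise `^` (under `use v5.28` or the `bitwise` feature you would need `^.` instead). It checks every offset and every start position, so it is only meant for short inputs.

```perl
use strict;
use warnings;

# Hypothetical helper: find the longest pair of equal-length substrings of
# $s and $t that differ in at most $n positions.
# Returns (length, offset in $s, offset in $t).
sub longest_with_mismatches {
    my ($s, $t, $n) = @_;
    my @best = (0, 0, 0);

    # Slide $t against $s: a positive shift skips into $t, a negative one into $s.
    for my $shift (-(length($s) - 1) .. length($t) - 1) {
        my ($i, $j) = $shift < 0 ? (-$shift, 0) : (0, $shift);
        my $len = length($s) - $i;
        $len = length($t) - $j if length($t) - $j < $len;

        # String-wise XOR of the overlap: "\0" wherever the two strings agree.
        my $x = substr($s, $i, $len) ^ substr($t, $j, $len);

        # The lookahead lets /g try every start position; from each one the
        # pattern greedily takes zero bytes plus at most $n non-zero bytes.
        while ($x =~ /(?=((?:\0*[^\0]){0,$n}\0*))/g) {
            my $l = length $1;
            @best = ($l, $i + $-[1], $j + $-[1]) if $l > $best[0];
        }
    }
    return @best;
}

# Small made-up example: one mismatch allowed.
my ($len, $pos_s, $pos_t) = longest_with_mismatches('AAAAA', 'XXAABAAYY', 1);
print "$len $pos_s $pos_t\n";    # prints "5 0 2"
```

Note that a hit may begin or end on a mismatching character, since that still satisfies "at most n mismatches"; trim those if you only want hits anchored on matching characters.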

        Cheers Rolf
        (addicted to the Perl Programming Language and ☆☆☆☆ :)
        Je suis Charlie!