Hi, onlyIDleft.

Your problem seems interesting.

However, you seem to have missed (I find it hard to believe that you deliberately ignored them) the requests from fellow monks for concrete examples of string pairs (DNA sequences) that either meet or don't meet your requirements.

To help us (and I agree with anonymonk that "This looks like a really fun problem to work on"), please reply to the following:

a. 9 out of 10 in both align to each other perfectly

No prob:

ACGTACGTAC GCGTACGTAC

That's okay, right?


b. 10 out of 10 in both align to each other perfectly

Even easier to understand:

ACGTACGTAC ACGTACGTAC

Perfect match, right?


c. 9 in one and 10 in other align to each other - with this imperfect alignment due to insertion/deletion

I don't understand. Please supply a couple of examples of pairs that meet/don't meet your requirements (with comments, if necessary).


d. 9 out of 9 in both align to each other, but imperfectly due to substitution - but I will allow only one such substitution - for biological reasons

Ditto


e. 10 out of 10 in both align to each other, but imperfectly due to substitution - but I will allow only one such substitution - again for biological reasons

I think I understand this one, but could you again supply a couple of examples of pairs that meet/don't meet your requirements?
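For what it's worth, here is how I currently read the substitution cases (a, b and e): two sequences of equal length that differ in at most one position. The sketch below is in Python purely for illustration, and the names `mismatches` and `within_one_substitution` are my own invention, not anything from your post. Please correct me if this reading is wrong.

```python
def mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        # Assumption: substitution-only comparison needs equal lengths.
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def within_one_substitution(a: str, b: str) -> bool:
    """True if the sequences are identical or differ by one substitution."""
    return mismatches(a, b) <= 1


if __name__ == "__main__":
    print(mismatches("ACGTACGTAC", "GCGTACGTAC"))  # 1 (first base differs)
    print(mismatches("ACGTACGTAC", "ACGTACGTAC"))  # 0 (perfect match)
```

Under this reading the pair from (a) passes with one mismatch and the pair from (b) passes with none. It says nothing about insertions or deletions, which is why I still need examples for (c).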


That said, I think that the CPAN module Text::Levenshtein might be what you are looking for. But that could depend on your answers to the above questions...
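To show what a Levenshtein (edit) distance actually measures, here is a minimal sketch of the standard dynamic-programming calculation. It is written in Python for illustration only and is not the code or the interface of Text::Levenshtein; the function name `levenshtein` is just my label. Each substitution, insertion or deletion of a single character counts as 1.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character substitutions, insertions
    and deletions needed to turn a into b."""
    # prev[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute, or keep if equal
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(levenshtein("ACGTACGTAC", "GCGTACGTAC"))  # 1 (one substitution)
    print(levenshtein("ACGTACGTAC", "ACGTACGTAC"))  # 0 (identical)
    print(levenshtein("ACGTACGTAC", "ACGTACGTA"))   # 1 (one deletion)
```

Note that the distance alone does not tell you whether the single edit was a substitution or an indel. If your rules treat those differently, you would also have to compare the lengths of the two sequences.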


Update: Corrected copy/paste error(s)

In reply to Re: Filtering matches of near-perfect-matched DNA sequence pairs by Not_a_Number
in thread Filtering matches of near-perfect-matched DNA sequence pairs by onlyIDleft
