in reply to Filtering matches of near-perfect-matched DNA sequence pairs
Hi, onlyIDleft.
Your problem seems interesting.
However, you seem to have missed (I find it hard to believe that you deliberately ignored them) the requests from fellow monks for concrete examples of string pairs (DNA sequences) that do or do not meet your requirements.
To help us (and I agree with anonymonk that "This looks like a really fun problem to work on"), please reply to the following:
a. 9 out of 10 in both align to each other perfectly
No prob:
ACGTACGTAC GCGTACGTAC
That's okay, right?
b. 10 out of 10 in both align to each other perfectly
Even easier to understand:
ACGTACGTAC ACGTACGTAC
Perfect match, right?
c. 9 in one and 10 in other align to each other - with this imperfect alignment due to insertion/deletion
I don't understand. Please supply a couple of examples of pairs that meet/don't meet your requirements (with comments, if necessary).
d. 9 out of 9 in both align to each other, but imperfectly due to substitution - but I will allow only one such substitution - for biological reasons
Ditto
e. 10 out of 10 in both align to each other, but imperfectly due to substitution - but I will allow only one such substitution - again for biological reasons
I think I understand this one, but could you again supply a couple of examples of pairs that meet/don't meet your requirements?
That said, I think the CPAN module Text::Levenshtein might be what you are looking for, though that depends on your answers to the questions above.
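To make the edit-distance idea concrete, here is a minimal sketch, written in Python only to keep it self-contained; it is not a Text::Levenshtein example, and the function name `levenshtein` is my own. It runs the two pairs from (a) and (b) above, plus one pair I made up to guess at what (c) might mean. That third pair is an assumption on my part, not something taken from your data.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of single-character substitutions,
    insertions and deletions needed to turn a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,           # delete ca
                            curr[j - 1] + 1,       # insert cb
                            prev[j - 1] + cost))   # match or substitute
        prev = curr
    return prev[-1]


pairs = [
    ("ACGTACGTAC", "GCGTACGTAC"),  # pair from (a): one substitution
    ("ACGTACGTAC", "ACGTACGTAC"),  # pair from (b): identical
    ("ACGTACGTAC", "ACGTACGTA"),   # HYPOTHETICAL pair for (c): one deletion
]

for s, t in pairs:
    print(s, t, levenshtein(s, t))
```

If the distance is 1, the lengths tell you which kind of edit it was: equal lengths can only mean a single substitution, and lengths differing by one can only mean a single insertion or deletion. Whether that matches your cases (c) to (e) is exactly what your examples would settle.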
Re^2: Filtering matches of near-perfect-matched DNA sequence pairs
by onlyIDleft (Scribe) on Mar 15, 2015 at 02:05 UTC
by BrowserUk (Patriarch) on Mar 15, 2015 at 07:04 UTC