Hi,
I have two strings that I want to compute the number of mismatches between them. These two strings are of the "same" size. Let's call them 'source' string and 'target' string. Now, the problem is that the 'source' string may come in ambiguous form, meaning that in one position they may contain more than 1 (upto 4) characters. The ambiguous position is marked with square bracketed [ATCG] region.

Is there an efficient way I can compute the number of mismatch between "source" and "target", of the above situation? The example is as follows:

Example 1 (where the source is ambiguous):
my $source1 = '[TCG]GGGG[AT]'; # ambiguous my $target1 = 'AGGGGC'; # No of mismatch = 2 on position 1 and 6 my $target2 = 'TGGGGC'; # No of mismatch = 1 on position 6 only
I have no problem when dealing when 'source' string is not ambigous like this
Example 2 (where the source is NOT ambiguous):
my $source2 = 'TGGGGT'; # not-ambiguous my $target1 = 'AGGGGC'; # No of mismatch = 2 on position 1 and 6 my $target3 = 'TGGGGT'; # No of mismatch = 0 all position matches
For example I can use bitwise operator to do it. But for ambigous case I'm lost.

PS: "target" string can be assumed to be never ambiguous.

Regards,
Edward

In reply to Mismatch Positions of Ambiguous String by monkfan

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.