Here is an easy one for all of you gurus. Efficiency is at a premium, as I need to perform the following operation roughly 1 x 10^8 times.
I need to evaluate whether two strings of the same length are identical after excluding all positions that have an 'N' in one or both strings. Both strings will always consist only of the characters A, T, G, C and N.
For example, the comparison of $a and $b would meet my criterion:
$a = 'ATGNCNC';
$b = 'ATGACNN';
But $c and $d would not:
$c = 'ATGNCNC';
$d = 'TTGNNNC';
(because the first character differs at a position that does not contain an 'N' in either string)
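To make the criterion concrete, here is a minimal sketch of one way it could be checked, using Perl's bitwise string operators instead of a character-by-character loop. The sub name equal_ignoring_n is made up for illustration, and it assumes the classic string behaviour of ^ and & (that is, the 'bitwise' feature is not enabled). I have not benchmarked it, so treat it as a starting point rather than a known-fast answer:

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the two strings agree at every
# position where neither string has an 'N', and 0 otherwise.
sub equal_ignoring_n {
    my ($s, $t) = @_;
    return 0 unless length($s) == length($t);

    # Build a mask per string: "\x00" where it has 'N', "\xFF" elsewhere.
    (my $mask_s = $s) =~ tr/ATGCN/\xFF\xFF\xFF\xFF\x00/;
    (my $mask_t = $t) =~ tr/ATGCN/\xFF\xFF\xFF\xFF\x00/;

    # String XOR yields "\x00" wherever the characters are equal;
    # ANDing with both masks zeroes out every position involving 'N'.
    my $diff = ($s ^ $t) & $mask_s & $mask_t;

    # Count the bytes that are not "\x00"; any such byte is a real mismatch.
    return ($diff =~ tr/\x00//c) ? 0 : 1;
}

print equal_ignoring_n('ATGNCNC', 'ATGACNN') ? "match\n" : "no match\n";  # match
print equal_ignoring_n('ATGNCNC', 'TTGNNNC') ? "match\n" : "no match\n";  # no match
```

The first call corresponds to the $a/$b pair above and the second to the $c/$d pair.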
Again, there is an extreme premium on efficiency here. Any thoughts?