in reply to Pattern searching allowing for mis-matches...
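My favorite trick for this sort of fuzzy matching is to combine bitwise string XOR with the transliteration operator, used here purely for counting. XORing two equal-length strings yields a NUL byte (\x00) at every position where the characters agree, so tr/\x00// applied to the result returns the number of matching positions. Consider: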
    use strict;
    use warnings;

    my $target   = 'TGATTGAATCAAGGTGTTTT';
    my $match    = 'TGAT';
    my $quality  = 0.75;                          # fraction of positions that must agree
    my $matchLen = length $match;
    my $matchNum = int ($matchLen * $quality);    # minimum number of agreeing positions

    # Slide a window the length of $match along $target
    for my $offset (0 .. length ($target) - $matchLen) {
        my $test = substr $target, $offset, $matchLen;

        # XOR gives \x00 wherever the two strings agree; tr counts those bytes
        my $matched = ($test ^ $match) =~ tr/\x00//;

        next if $matched < $matchNum;
        print "Found <$test> at offset $offset which matches in $matched places\n";
    }
Prints:
    Found <TGAT> at offset 0 which matches in 4 places
    Found <TGAA> at offset 4 which matches in 3 places
    Found <TGTT> at offset 14 which matches in 3 places
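If you want to reuse the trick, one way to package it is as a pair of small subs. This is only a sketch: the names count_matches and fuzzy_find are made up for illustration, and it assumes single-byte characters (such as DNA letters) and that `^` is acting as string XOR, which is the case under plain `use strict; use warnings;` when both operands are strings. With `use feature 'bitwise'` in effect you would need `^.` instead.

```perl
use strict;
use warnings;

# Hypothetical helper: number of positions at which two equal-length
# strings hold the same byte. String XOR yields "\x00" where they agree.
sub count_matches {
    my ($left, $right) = @_;
    my $count = ($left ^ $right) =~ tr/\x00//;
    return $count;
}

# Hypothetical helper: return [offset, window, matched] for every window of
# $target that agrees with $pattern in at least int(length * $quality) places.
sub fuzzy_find {
    my ($target, $pattern, $quality) = @_;
    my $len  = length $pattern;
    my $need = int ($len * $quality);
    my @hits;

    # The range is empty when $pattern is longer than $target
    for my $offset (0 .. length ($target) - $len) {
        my $window  = substr $target, $offset, $len;
        my $matched = count_matches ($window, $pattern);
        push @hits, [$offset, $window, $matched] if $matched >= $need;
    }
    return @hits;
}

# Same data as the example above
printf "%2d %s %d\n", @$_
    for fuzzy_find ('TGATTGAATCAAGGTGTTTT', 'TGAT', 0.75);
```

Returning the hits as a list of array refs rather than printing inside the loop leaves the caller free to sort them by match count or filter them further.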
Re^2: Pattern searching allowing for mis-matches...
by MaroonBalloon (Acolyte) on Dec 15, 2009 at 07:53 UTC
by GrandFather (Saint) on Dec 15, 2009 at 19:07 UTC
by MaroonBalloon (Acolyte) on Dec 15, 2009 at 20:24 UTC