I'm still relatively new to Perl but getting better with practice. My question for now is how to compare two strings of the same length (say, about 40 characters) and count the mismatching characters by type. Take, for example, the following strings:
target = ATTCCGGG
str1   = ATTGCGGG
str2   = ATACCGGC
I would like to compare str1 to "target" and str2 to "target", and record the type of mismatch at each mismatching position. Comparing str1 to target gives one mismatch at position 3 (counting from 0), which is C->G. Comparing str2 to target gives two mismatches, at positions 2 and 7, which are T->A and G->C respectively. Is there an efficient way to do this for millions of different targets and strings?
I have the following code using PDL:

    use PDL;
    use PDL::Char;
    $PDL::SHARE = $PDL::SHARE; # keep stray warning quiet

    my $source = PDL::Char->new("ATTCCGGG");
    for my $str ("ATTGCGGG") {
        my $match = PDL::Char->new($str);
        my @diff  = which($match != $source)->list;
        print "@diff\n";
    }

This code doesn't give me the specific types of mismatches, though. An A, T, G, or C in the target can change into an A, T, G, or C in the strings, so I would like to keep track of these conversions. Any advice?
Original content restored above by GrandFather
Oops... I deleted my post. I'm trying to find the positions and types of differences between two strings. Take the "target" and the strings:
    $target = "ATTCCGGG";
    $str1   = "ATTGCGGG"; # 1 mismatch with target at position 3 (C->G)
    $str2   = "ATACCGGC"; # 2 mismatches with target at position 2 and 7 (T->A and G->C)
How do I go about obtaining the differences between millions of targets and strings in an efficient manner? I have the following code using PDL:

    use PDL;
    use PDL::Char;
    $PDL::SHARE = $PDL::SHARE; # keep stray warning quiet

    my $source = PDL::Char->new("ATTCCGGG");
    for my $str ("ATTGCGGG", "ATACCGGC") {
        my $match = PDL::Char->new($str);
        my @diff  = which($match != $source)->list;
        print "@diff\n";
    }

But this only gives me positions. How do I look for the actual conversions that occur too? An A, T, C, or G in the target can convert to an A, T, C, or G in the string. Any advice?
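For comparison, here is a minimal plain-Perl sketch (not PDL) of one way the positions and the conversion types might be collected together. It relies on Perl's string XOR: when both operands of `^` are strings and the `bitwise` feature is not enabled, the result has a "\0" byte wherever the two strings agree. The helper name `mismatches` and the `%count` hash are made up for illustration, and I have not benchmarked this against millions of strings.

```perl
use strict;
use warnings;

# Hypothetical helper: returns one [position, target_char, string_char]
# triple per mismatch. Positions are 0-based, as in the examples above.
sub mismatches {
    my ($target, $str) = @_;
    die "strings differ in length\n" unless length($target) == length($str);

    # String XOR: "\0" wherever the two strings have the same character.
    my $xor = $target ^ $str;

    my @found;
    while ($xor =~ /[^\0]/g) {
        my $pos = pos($xor) - 1;    # pos() points just past the matched byte
        push @found, [ $pos, substr($target, $pos, 1), substr($str, $pos, 1) ];
    }
    return @found;
}

my $target = "ATTCCGGG";
my %count;    # tally of each conversion type, e.g. "C->G"

for my $str ("ATTGCGGG", "ATACCGGC") {
    for my $m (mismatches($target, $str)) {
        my ($pos, $from, $to) = @$m;
        my $type = $from . '->' . $to;
        $count{$type}++;
        printf "%s: position %d %s\n", $str, $pos, $type;
    }
}

printf "%s: %d\n", $_, $count{$_} for sort keys %count;
```

With the two example strings this should report C->G at position 3 for the first string and T->A and G->C at positions 2 and 7 for the second, followed by a count of 1 for each conversion type.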
In reply to mismatching characters in dna sequence by prbndr