in reply to Re: How can I count the number of substitutions of one letter in a string by another letter in the other string?
in thread How can I count the number of substitutions of one letter in a string by another letter in the other string?

Whilst orders of magnitude faster, as posted, your solution is missing some possibilities.

Given the input of:

$seq1 = 'AAAACCCCGGGGTTTT'; $seq2 = 'ACGTACGTACGTACGT';

It should produce 12 counts:

{ A2C => 1, A2G => 1, A2T => 1, C2A => 1, C2G => 1, C2T => 1, G2A => 1, G2C => 1, G2T => 1, T2A => 1, T2C => 1, T2G => 1, }

But only produces the following eight:

A2G=1; G2A=1; C2T=1; T2C=1; A2T=1; A2C=1; G2T=1; G2C=1.

I'm sure that this is eminently fixable.


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

The start of some sanity?

  • Comment on Re^2: How can I count the number of substitutions of one letter in a string by another letter in the other string?
  • Select or Download Code

Replies are listed 'Best First'.
Re^3: How can I count the number of substitutions of one letter in a string by another letter in the other string?
by Eliya (Vicar) on May 11, 2012 at 23:19 UTC
    your solution is missing some possibilities.

    Well, I implemented (on purpose) exactly the list the OP asked for.  It can of course be extended, if so desired:

    my @cnt = qw(A2G G2A C2T T2C A2T A2C G2T G2C ...);

    Alternatively, simply compute the full matrix of count routines:

    #!/usr/bin/perl -w use strict; my %cnt; my @c = qw(A T C G); for my $c1 (@c) { for my $c2 (@c) { next if $c1 eq $c2; (my $t = $c2) =~ tr/ATCG/HRDZ/; my $x = $c1 ^ $t; $cnt{$c1."2".$c2} = eval "sub { \$_[0] =~ tr/$x/$x/ }"; } } my $seq1 = 'AAAACCCCGGGGTTTT'; my $seq2 = 'ACGTACGTACGTACGT'; (my $tmp = $seq2) =~ tr/ATCG/HRDZ/; my $diff = $seq1 ^ $tmp; print join("; ", map { $_."=".$cnt{$_}->($diff) } sort keys %cnt ), ". +"; __END__ A2C=1; A2G=1; A2T=1; C2A=1; C2G=1; C2T=1; G2A=1; G2C=1; G2T=1; T2A=1; +T2C=1; T2G=1.