Yes. The post mentions as a caveat that where the sequence length is not an exact multiple of the subsequence length, the short subsequences will need to be removed (as I did in my second attempt at Re^3: Question about speeding a regexp count).
But then again, it is so slow compared to the other methods that it doesn't really warrant consideration anyway. I did come up with this version:
sub browser2 {
    my %count;

    # Single bases: tr with identical lists counts matches without changing $seq.
    $count{ A } = $seq =~ tr[A][A];
    $count{ C } = $seq =~ tr[C][C];
    $count{ G } = $seq =~ tr[G][G];
    $count{ T } = $seq =~ tr[T][T];

    # Pairs: index returns -1 on failure, so 1+index is 0 (false) and the loop ends.
    for( qw'AA AC AG AT CA CC CG CT GA GC GG GT TA TC TG TT' ) {
        my $p=0;
        $count{ $_ }++ while $p = 1+index $seq, $_, $p;
    }

    # Triplets, counted the same way.
    for( qw[
        TTT TTG TTC TTA TGT TGG TGC TGA TCT TCG TCC TCA TAT TAG TAC TAA
        GTT GTG GTC GTA GGT GGG GGC GGA GCT GCG GCC GCA GAT GAG GAC GAA
        CTT CTG CTC CTA CGT CGG CGC CGA CCT CCG CCC CCA CAT CAG CAC CAA
        ATT ATG ATC ATA AGT AGG AGC AGA ACT ACG ACC ACA AAT AAG AAC AAA
    ] ) {
        my $p=0;
        $count{ $_ }++ while $p = 1+index $seq, $_, $p;
    }
    1;
}
Which is a lot quicker, but still much slower than skeeve's and one of sauoq's.
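For anyone who wants to poke at the index idiom on its own, here is a minimal sketch that pulls it out into a helper. The name count_overlapping and the sample string are my own assumptions for illustration; they do not come from the benchmark in this thread. It shows that restarting the search one character past the start of each hit counts overlapping occurrences.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the benchmarked code): counts how many
# times $sub occurs in $seq, including overlapping occurrences.
sub count_overlapping {
    my( $seq, $sub ) = @_;
    return 0 unless length $sub;    # an empty needle would never terminate
    my( $n, $p ) = ( 0, 0 );
    # index returns -1 when nothing is found, so 1+index becomes 0 (false)
    # and the loop stops; on a hit, $p is one past the start of the match,
    # which is where the next search begins.
    $n++ while $p = 1 + index $seq, $sub, $p;
    return $n;
}

# Made-up sample data, chosen so that some matches overlap.
my $seq = 'AAAACGT';
printf "%s => %d\n", $_, count_overlapping( $seq, $_ ) for qw[ AA AAA CG TT ];
```

Because the search restarts at the match offset plus one rather than after the whole match, 'AA' is found three times in 'AAAA'. If non-overlapping counts were wanted instead, the restart offset would have to advance by the length of the needle.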