So the best way to solve this is to actually have the sequence in a plain string, and walk that string in a perlish way. Here substr() seems a good operator (you can also use a pattern match that selects 2 chars at a time, but that turns out to be slower).
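If you want to check the substr() versus pattern match claim on your own perl, a rough sketch using the core Benchmark module could look like this. The random test data, the sub names and the lookahead regex are my own assumptions for illustration (the regex is just one way to pick up 2 chars at a time), so treat the numbers you get as indicative only.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Made-up test data: 100_000 random bases in one plain string.
my @bases  = qw(a c g t);
my $genome = join '', map { $bases[rand @bases] } 1 .. 100_000;

# Hypothetical helper: walk the string with substr().
sub by_substr {
    my %count;
    $count{substr($genome, $_, 2)}++ for 0 .. length($genome) - 2;
    return \%count;
}

# Hypothetical helper: walk the string with a pattern match.
sub by_regex {
    my %count;
    # The lookahead consumes no characters, so the 2-char windows overlap.
    $count{$1}++ while $genome =~ /(?=(..))/g;
    return \%count;
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, { substr => \&by_substr, regex => \&by_regex });
```

Both subs should produce the same counts, so only the speed differs.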
Making perl fast is also to a great extent a matter of reducing the number of opcodes that get executed. So your multiple if tests should be replaced by something that takes fewer operations. As pointed out by the other answers, you can use a hash here. While a hash lookup is slightly more work than a single simple test, it only has to be done once per position instead of running through a chain of tests.
So the code becomes:
    my %count;
    $count{substr($genome, $_, 2)}++ for 0..length($genome)-2;
    # Next combine the counts like in the original code
    # is this a bug or intentional ?
    $count{tt} += $count{aa};
    $count{ag} += $count{ct};
    $count{ac} += $count{gt};
    $count{tg} += $count{ca};
    $count{ga} += $count{tc};
    $count{cc} += $count{gg};

This is about 4 times as fast as an array based solution using a hash on my perl.
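To see what the substr() walk produces, here is a small self-contained sketch. The sub name count_pairs and the sample string are made up for illustration and are not part of the original code.

```perl
use strict;
use warnings;

# Hypothetical helper: count every overlapping 2-character window.
sub count_pairs {
    my ($genome) = @_;
    my %count;
    # The last valid start offset is length-2; for strings shorter
    # than 2 characters the range is empty and nothing is counted.
    $count{substr($genome, $_, 2)}++ for 0 .. length($genome) - 2;
    return \%count;
}

my $count = count_pairs("aacaag");    # made-up sample sequence
printf "%s => %d\n", $_, $count->{$_} for sort keys %$count;
# prints:
# aa => 2
# ac => 1
# ag => 1
# ca => 1
```

Note that pairs which never occur simply have no key in the hash, so under warnings you may want to default them to 0 before combining counts.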
If this still isn't fast enough, you can start looking at things like Inline::C.
In reply to Re: how can I speed up this perl??
by thospel
in thread how can I speed up this perl??
by Anonymous Monk