in reply to how can I speed up this perl??

Using a hash, as the other monks have suggested, certainly makes your code cleaner and better structured.

Speaking strictly to the speed problem, the major issue is that you repeat the [] operation unnecessarily, when you only need it twice per iteration. Repeating that lookup before every string comparison is expensive.

You should be able to speed your code up simply by doing this right before your if-else chain:

my $a = $genome[$i]; my $b = $genome[$i + 1];

In the subsequent code, use only $a and $b, and no further [] operations.
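A minimal sketch for measuring the difference rather than guessing at it, using the core Benchmark module. Everything here is an assumption for illustration: the random sample data, the sub names count_repeated and count_cached, and the fact that only two of the counters are shown. The cached values are named $first and $second rather than $a and $b, because Perl uses $a and $b in sort blocks.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sample data: one lowercase base per array element.
my @genome = map { (qw(a c g t))[ int rand 4 ] } 1 .. 10_000;

# Repeated subscripts: the array is indexed again in every test.
sub count_repeated {
    my ($g) = @_;
    my ($aa, $ac) = (0, 0);
    for my $i (0 .. $#$g - 1) {
        if    ($g->[$i] eq 'a' && $g->[$i + 1] eq 'a') { $aa++ }
        elsif ($g->[$i] eq 'a' && $g->[$i + 1] eq 'c') { $ac++ }
    }
    return ($aa, $ac);
}

# Cached: each element is fetched once per iteration.
sub count_cached {
    my ($g) = @_;
    my ($aa, $ac) = (0, 0);
    for my $i (0 .. $#$g - 1) {
        my $first  = $g->[$i];
        my $second = $g->[$i + 1];
        if    ($first eq 'a' && $second eq 'a') { $aa++ }
        elsif ($first eq 'a' && $second eq 'c') { $ac++ }
    }
    return ($aa, $ac);
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    repeated => sub { count_repeated(\@genome) },
    cached   => sub { count_cached(\@genome) },
});
```

The numbers depend on the machine and the perl version, so treat the table as a rough indication only.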

This is a direct answer to your speed issue. Don't get me wrong: you should still use a hash as your storage. As a matter of fact, it not only makes your code more structured but also removes the unnecessary use of the [] operation.
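For reference, a minimal sketch of the hash approach, assuming the genome is held as one base per array element. The sub name count_pairs and the sample data are made up for the example; the two-letter hash key takes the place of the separate counters.

```perl
use strict;
use warnings;

# Count every adjacent pair of elements in one pass.
sub count_pairs {
    my ($g) = @_;
    my %count;
    for my $i (0 .. $#$g - 1) {
        # The two-letter key replaces the whole if/elsif chain.
        $count{ $g->[$i] . $g->[$i + 1] }++;
    }
    return \%count;
}

my @genome = qw(a c g t a c);   # hypothetical sample data
my $count  = count_pairs(\@genome);
printf "%s: %d\n", $_, $count->{$_} for sort keys %$count;
```

Each element is still indexed twice per iteration, but there is no comparison chain left at all.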

Re: how can I speed up this perl??
by Abigail-II (Bishop) on Nov 24, 2003 at 16:58 UTC
    Your suggestion will save some, but just some. You've reduced the fetching of a value from walking 4 pointers to walking 2 pointers. That's a peephole optimization; the big gain is to be made by not comparing so much.

    Even if you don't want to use a hash, you can still do:

    if ($genome [$i] eq 'a') {
        if    ($genome [$i + 1] eq 'a') {$aa ++}
        elsif ($genome [$i + 1] eq 'c') {$ac ++}
        elsif ($genome [$i + 1] eq 'g') {$ag ++}
        elsif ($genome [$i + 1] eq 't') {$at ++}
    }
    elsif ($genome [$i] eq 'c') {
        if    ($genome [$i + 1] eq 'a') {$ca ++}
        elsif ($genome [$i + 1] eq 'c') {$cc ++}
        ... etc ...
    which reduces the number of comparisons from max 20 to max 5.

    Abigail

      Yep, obviously the comparison is also part of the problem. As a matter of fact, both of our posts in this sub-thread should be read only as performance analysis; the actual implementation should still use a hash, which resolves everything in one shot.

        Going a bit off-topic: how does the internal hash lookup compare to the array solution he has now? It seems to me that internally Perl would be doing nearly the same thing, comparing the given key to each of the keys in the hash.
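On that question: a hash table does not normally compare the key against every stored key. It computes a number from the key, uses that number to pick a bucket, and then compares only against the keys that landed in the same bucket. The sketch below is a toy illustration of that idea in plain Perl, not Perl's actual implementation; the hash function, the bucket count and the sub names bucket_of and increment are all invented for the example.

```perl
use strict;
use warnings;

my $NBUCKETS = 8;                          # arbitrary size for the toy table
my @buckets  = map { [] } 1 .. $NBUCKETS;  # each bucket holds [key, count] pairs

# Toy hash function: sum of character codes, modulo the bucket count.
sub bucket_of {
    my ($key) = @_;
    my $sum = 0;
    $sum += ord($_) for split //, $key;
    return $sum % $NBUCKETS;
}

# Increment the counter for $key.
# Returns how many stored keys had to be compared along the way.
sub increment {
    my ($key) = @_;
    my $bucket   = $buckets[ bucket_of($key) ];
    my $compared = 0;
    for my $entry (@$bucket) {
        $compared++;
        if ($entry->[0] eq $key) {
            $entry->[1]++;
            return $compared;
        }
    }
    push @$bucket, [ $key, 1 ];            # first time this key is seen
    return $compared;
}

print "bucket for 'ac': ", bucket_of('ac'), "\n";
```

With this deliberately weak hash function, 'ac' and 'ca' share a bucket, so looking up one may cost a comparison against the other; keys in other buckets are never touched.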