in reply to Question about speeding a regexp count

Commander Salamander,
This counts every 1-, 2- and 3-character substring with a single unpack call. It trades memory for speed, but it still ran fine on my 256MB machine with a 600_000-byte DNA strand.

# Slurp the DATA lines into one string with the newlines removed.
my $dna = join '', map { chomp; $_ } <DATA>;

# At each position: A reads 1 char, X backs up 1, A2 reads 2, X2 backs up 2,
# A3 reads 3, X2 backs up 2 (a net advance of 1). The tail handles the last
# two positions, where fewer than 3 chars remain.
my $template = ('AXA2X2A3X2' x (length($dna) - 2)) . 'AXA2XA';

# unpack returns every substring as one big list, which is where the memory goes.
my %count;
$count{$_}++ for unpack $template, $dna;

print "$_\t$count{$_}\n" for keys %count;

__DATA__
AAAAAAAACAAGAATACACAACCACGACTAGAGAAGCAGGAGTATATAATCATGATTCCACAACACCAGCATCCCCACCCCCGCCTCGCGACGCCGGCGT
CTCTACTCCTGCTTGGAGAAGACGAGGATGCGCAGCCGCGGCTGGGGAGGCGGGGGTGTGTAGTCGTGGTTTTATAATACTAGTATTCTCATCCTCGTCT
TGTGATGCTGGTGTTTTTATTCTTGTTTAACACAACCACTAGAGCAGTATATAATCCCACACCAGCCCCCCCTCGCGACGGCGTCTCTACTCCTGGGAGA
CGAGGATGCGCAGCGGCTGGGGAGGGGTGTAGTCTTATACTAGTATTCTCCTCGTCTTGTGATGCTGGACTGGGGTCGATCGTCGAAATCGGCTAGCTAA
AAAAAAACAAGAATACACAACCACGACTAGAGAAGCAGGAGTATATAATCATGATTCCACAACACCAGCATCCCCACCCCCGCCTCGCGACGCCGGCGTC
TCTACTCCTGCTTGGAGAAGACGAGGATGCGCAGCCGCGGCTGGGGAGGCGGGGGTGTGTAGTCGTGGTTTTATAATACTAGTATTCTCATCCTCGTCTT
GTGATGCTGGTGTTTTTATTCTTGTTTAACACAACCACTAGAGCAGTATATAATCCCACACCAGCCCCCCCTCGCGACGGCGTCTCTACTCCTGGGAGAC

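
A quick way to see what the template produces is to run it on a tiny string. This is only a sketch: the 4-character input is made up for illustration, and the template line is copied from the code above.

```perl
use strict;
use warnings;

# Tiny input chosen for illustration; not part of the original data.
my $dna = 'ACGT';

# Same template construction as in the post.
my $template = ('AXA2X2A3X2' x (length($dna) - 2)) . 'AXA2XA';

# Each position contributes its 1-, 2- and 3-character substrings
# (fewer near the end of the string).
my @pieces = unpack $template, $dna;
print join(' ', @pieces), "\n";   # prints "A AC ACG C CG CGT G GT T"
```
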
If memory becomes an issue, you could instead walk the length of the string using substr or unpack. That requires keeping track of your position in the string, but both functions let you start "inside" it: substr takes an offset argument, and an unpack template can begin with @N to jump to an absolute position.
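
Here is a minimal sketch of the substr version, assuming you still want counts of every 1-, 2- and 3-character substring. The count_kmers name and its $max_len parameter are hypothetical, not from the original code, and I have not benchmarked it against the unpack approach.

```perl
use strict;
use warnings;

# Hypothetical helper: walk the string one position at a time with substr,
# so only the counts hash is held in memory, not the full list of substrings.
sub count_kmers {
    my ($dna, $max_len) = @_;
    my %count;
    my $len = length $dna;
    for my $pos (0 .. $len - 1) {
        for my $k (1 .. $max_len) {
            last if $pos + $k > $len;          # not enough characters left
            $count{ substr $dna, $pos, $k }++;
        }
    }
    return \%count;
}

# Short sample strand, made up for illustration.
my $count = count_kmers('ACGTAC', 3);
print "$_\t$count->{$_}\n" for sort keys %$count;
```

This will likely be slower than the single unpack call, since it makes one substr call per substring, but memory use no longer grows with the number of substrings extracted.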

Cheers - L~R

Update: After testing the memory consumption, I updated the post to reflect that this approach may be realistic to use.