in reply to Regex KungFu help needed
As moritz pointed out, adding use re 'eval' will solve your problem:
    use warnings;
    use strict;
    use re 'eval';

    my @real_count = (0,0,0,0);
    my $sequence = "GGGGGGGAGAAAAAAAAAAAAAAAGAAGGA";

    my @pattern;
    $pattern[0] = "AAAAA";
    $pattern[1] = "GGGGG";
    $pattern[2] = "GGAGA";
    $pattern[3] = "GAAGG";

    for (my $i=0; $i <= 3; $i++) {
        ## the code block counts each match, then (?!) always fails,
        ## forcing the engine to retry from the next start position
        $sequence =~ /$pattern[$i](?{$real_count[$i]++})(?!)/;
    }

    foreach (@real_count) {
        print "$_\n";
    }
    ## prints
    ## 11
    ## 3
    ## 1
    ## 1
It is purely personal preference, but I think I prefer modifying pos to find overlapping matches:
    use warnings;
    use strict;

    my @real_count = (0,0,0,0,);
    my $sequence = "GGGGGGGAGAAAAAAAAAAAAAAAGAAGGA";
    my @patterns = qw/AAAAA GGGGG GGAGA GAAGG/;

    for my $i ( 0 .. $#patterns ) {
        while ( $sequence =~ m/$patterns[$i]/g ){
            $real_count[$i]++;
            ## reset start position for next global match search
            pos($sequence) -= (length $patterns[$i]) - 1;
        }
    }

    foreach (@real_count) {
        print "$_\n";
    }
    ## prints
    ## 11
    ## 3
    ## 1
    ## 1
I guess this is mainly a maintainability thing: being a regex whiz is one thing, but gods help whoever has to maintain the code after you! If you are worried about which is faster (I guess you are not just matching 5-base patterns against 30 or so nucleotides), there is a lot of info in the monastery about Benchmarking.
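For what it is worth, here is a rough sketch of how the two approaches above could be timed against each other with the core Benchmark module. The sub names count_code_block and count_pos_reset are made up for this example, and the package variable $hits is used instead of a lexical only because lexicals inside (?{ }) blocks can misbehave on older perls. Treat it as a starting point rather than a definitive measurement; real numbers will depend on your perl version and on sequences much longer than this one.

```perl
use strict;
use warnings;
use re 'eval';
use Benchmark qw(cmpthese);

my $sequence = "GGGGGGGAGAAAAAAAAAAAAAAAGAAGGA";
my @patterns = qw/AAAAA GGGGG GGAGA GAAGG/;

our $hits;    # package variable, incremented from inside the regex

## Approach 1: count in a code block, then force failure with (?!)
sub count_code_block {
    my ($seq, $pat) = @_;
    $hits = 0;
    $seq =~ /$pat(?{ $hits++ })(?!)/;
    return $hits;
}

## Approach 2: global match, stepping pos back to allow overlaps
sub count_pos_reset {
    my ($seq, $pat) = @_;
    my $n = 0;
    while ( $seq =~ /$pat/g ) {
        $n++;
        pos($seq) -= length($pat) - 1;
    }
    return $n;
}

## sanity check: both approaches must agree before timing them
for my $pat (@patterns) {
    die "count mismatch for $pat\n"
        unless count_code_block($sequence, $pat) == count_pos_reset($sequence, $pat);
}

## run each approach for at least 1 CPU second and print a comparison table
cmpthese( -1, {
    code_block => sub { count_code_block($sequence, $_) for @patterns },
    pos_reset  => sub { count_pos_reset($sequence, $_)  for @patterns },
});
```

The sanity check is there so that you never compare the speed of two subs that give different answers.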