Re: Regex KungFu help needed

As moritz pointed out, adding use re 'eval' will solve your problem:

use warnings;
use strict;

use re 'eval';

my @real_count = (0,0,0,0);
my $sequence = "GGGGGGGAGAAAAAAAAAAAAAAAGAAGGA";
my @pattern;
$pattern[0] = "AAAAA";
$pattern[1] = "GGGGG";
$pattern[2] = "GGAGA";
$pattern[3] = "GAAGG";

for (my $i=0; $i <= 3; $i++) {
    $sequence =~ /$pattern[$i](?{$real_count[$i]++})(?!)/;
}

foreach (@real_count) {
    print "$_\n";
}

## prints
## 11
## 3
## 1
## 1

It is purely personal preference, but I think I prefer modifying pos to find overlapping matches:

use warnings;
use strict;

my @real_count = (0,0,0,0,);
my $sequence = "GGGGGGGAGAAAAAAAAAAAAAAAGAAGGA";
my @patterns = qw/AAAAA GGGGG GGAGA GAAGG/;

for my $i ( 0 .. $#patterns ) {
    while ( $sequence =~ m/$patterns[$i]/g ){
        $real_count[$i]++;
        ## reset start position for next global match search
        pos($sequence) -= length($patterns[$i]) - 1;
    }
}

foreach (@real_count) {
    print "$_\n";
}

## prints
## 11
## 3
## 1
## 1

I guess this is mainly a maintainability thing: being a regex whizz is one thing, but gods help whoever has to maintain the code after you! If you are worried about which is faster (I guess you are not really just matching 5-base patterns against 30 or so nucleotides), there is a lot of info in the monastery about benchmarking.
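
If you do want numbers, here is a rough sketch of how the two approaches could be compared with the core Benchmark module. The sub names count_with_code_block and count_with_pos are just labels I made up for this example, and I use a package variable for the counter inside the code block to sidestep closure quirks on older perls. The pos version assumes the patterns are plain literal strings, so that the pattern length equals the match length. Timings will depend on your perl and your data, so treat this as a starting point and not a verdict.

```perl
use warnings;
use strict;
use Benchmark qw(cmpthese);

my $sequence = "GGGGGGGAGAAAAAAAAAAAAAAAGAAGGA";
my @patterns = qw/AAAAA GGGGG GGAGA GAAGG/;

# Approach 1: bump a counter inside the regex, then force failure with (?!)
# so the engine backtracks and tries every start position.
sub count_with_code_block {
    my ($seq, $pat) = @_;
    use re 'eval';            # needed on older perls for interpolation + (?{ })
    local our $count = 0;     # package variable avoids lexical closure issues
    $seq =~ /$pat(?{ $count++ })(?!)/;
    return $count;
}

# Approach 2: global match, then rewind pos so the next search
# starts one character after the start of the previous match.
# Assumes $pat is a literal string (pattern length == match length).
sub count_with_pos {
    my ($seq, $pat) = @_;
    my $count = 0;
    while ( $seq =~ /$pat/g ) {
        $count++;
        pos($seq) -= length($pat) - 1;
    }
    return $count;
}

# Run each approach for at least 1 CPU second and print a comparison table.
cmpthese( -1, {
    code_block => sub { count_with_code_block($sequence, $_) for @patterns },
    pos_reset  => sub { count_with_pos($sequence, $_)        for @patterns },
} );
```

With real data you would swap in your own $sequence and @patterns; a 30-character string is too small to say much about how either approach scales.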

Just a something something...
