An important part of writing benchmark code is ensuring that the alternatives do the same thing. Your do_eval subroutine had a minor problem (only the first match was bound with $genome =~, instead of all of them), which I corrected. But when I then compared the output of the two subroutines, I noticed that it differed:

    sub do_eval {
        my $genome = "AGTATCGATCGATGCATGCTAGCTAGCTAGCTAGCTAGCTAGSTGCTAGCT";
        my @regexes = ('AGT', 'ATC'); # dont care
        my $count = 0;
        my $code = 'if ('
            . join(' && ', map { "\$genome =~ /$_/" } @regexes)
            . ') { $count++; }';
        eval $code;
        die "Error: $@\n Code:\n$code\n" if ($@);
        return $count; # returns 1
    }

    sub do_qr {
        my $string = "AGTATCGATCGATGCATGCTAGCTAGCTAGCTAGCTAGCTAGSTGCTAGCT";
        my @regexes = ("AGT", "ATC"); # dont care
        my $count = 0;
        my @compiled = map qr/$_/, @regexes;
        for(my $i=0; $i<@regexes; $i++) {
            if($string =~ /$compiled[$i]/){
                $count++;
            }
        }
        return $count; # returns 2
    }

To fix that, I changed your do_eval sub to the following:

    sub do_eval {
        my $genome = "AGTATCGATCGATGCATGCTAGCTAGCTAGCTAGCTAGCTAGSTGCTAGCT";
        my @regexes = ('AGT', 'ATC'); # dont care
        my $count = 0;
        my $code = join ";", map { "\$count++ if \$genome =~ /$_/" } @regexes;
        eval $code;
        die "Error: $@\n Code:\n$code\n" if ($@);
        return $count; # returns 2
    }
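A check like this can be automated before any timing is done. The following is only a sketch: assert_same_result and the two candidate names are hypothetical helpers I made up for illustration, and the candidates merely mirror the logic of the '&&' version and the loop version rather than reproducing them.

```perl
use strict;
use warnings;

# Hypothetical helper: run every candidate on the same arguments and
# die as soon as one of them returns a different number than the first.
sub assert_same_result {
    my ($candidates, @args) = @_;
    my ($first_name, $expected);
    for my $name (sort keys %$candidates) {
        my $got = $candidates->{$name}->(@args);
        if (!defined $first_name) {
            ($first_name, $expected) = ($name, $got);
        }
        elsif ($got != $expected) {
            die "$name returned $got, but $first_name returned $expected\n";
        }
    }
    return $expected;
}

my $genome   = "AGTATCGATCGATGCATGCTAGCTAGCTAGCTAGCTAGCTAGSTGCTAGCT";
my @patterns = ('AGT', 'ATC');

my %candidates = (
    # Mirrors the '&&' logic: a single increment if every pattern matches.
    joined_with_and => sub {
        my ($genome, @patterns) = @_;
        my $matched = grep { $genome =~ /$_/ } @patterns;
        return $matched == @patterns ? 1 : 0;
    },
    # Mirrors the loop logic: one increment per matching pattern.
    one_per_pattern => sub {
        my ($genome, @patterns) = @_;
        return scalar grep { $genome =~ /$_/ } @patterns;
    },
);

# The two candidates disagree (1 versus 2), so the helper dies here.
eval { assert_same_result(\%candidates, $genome, @patterns) };
print "Mismatch caught: $@" if $@;
```

Running something like this before the benchmark means a disagreement shows up as an error instead of being spotted by eye.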

And I decided to add my own take on the matter, which generates One Big Regex, rather than a bunch of them:

    sub do_genre {
        my $genome = "AGTATCGATCGATGCATGCTAGCTAGCTAGCTAGCTAGCTAGSTGCTAGCT";
        my @regexes = ("AGT", "ATC"); # dont care
        my $regex = join "|", map "($_)", @regexes;
        my $count = () = $genome =~ /$regex/;
        return $count; # returns 2
    }
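One thing worth checking with this approach, in the spirit of making sure the alternatives do the same thing: without /g, a successful match in list context returns one element per capture group, whether or not that group took part in the match. The sketch below uses two hypothetical helper names and a pattern (ZZZ) that does not occur in the sample string to show the difference between counting capture groups and counting patterns that match.

```perl
use strict;
use warnings;

# Counts the way the list-assignment idiom does: a successful match
# without /g yields ($1, $2, ...), one element per capture group.
sub count_capture_groups {
    my ($genome, @patterns) = @_;
    my $regex = join "|", map "($_)", @patterns;
    my $count = () = $genome =~ /$regex/;
    return $count;
}

# Counts how many of the patterns match the string at least once.
sub count_matching_patterns {
    my ($genome, @patterns) = @_;
    return scalar grep { $genome =~ /$_/ } @patterns;
}

my $genome = "AGTATCGATCGATGCATGCTAGCTAGCTAGCTAGCTAGCTAGSTGCTAGCT";

print count_capture_groups($genome, 'AGT', 'ZZZ'), "\n";    # prints 2
print count_matching_patterns($genome, 'AGT', 'ZZZ'), "\n"; # prints 1
```

With the two patterns used in this thread both approaches give 2, so the difference only becomes visible once one of the patterns fails to match.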

When I run the benchmark, I get the following results:

                 Rate  do_eval    do_qr do_genre
    do_eval   15531/s       --     -65%     -90%
    do_qr     44671/s     188%       --     -72%
    do_genre 157893/s     917%     253%       --

Which just goes to show that the string eval is the slowest of the three, and that looping over separate regexes is still much slower than a single combined regex. A different algorithm makes a big difference.
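For anyone who wants to repeat the comparison, a harness along these lines produces a table in the same format. It is a minimal sketch, not the exact script used for the numbers above: the sub names are my own, the subs take their input as arguments, and only the eval and qr variants are included.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $genome   = "AGTATCGATCGATGCATGCTAGCTAGCTAGCTAGCTAGCTAGSTGCTAGCT";
my @patterns = ('AGT', 'ATC');

# String eval: build one statement per pattern, then compile and run it.
sub count_with_eval {
    my ($genome, @patterns) = @_;
    my $count = 0;
    my $code  = join ";", map { "\$count++ if \$genome =~ /$_/" } @patterns;
    eval $code;
    die "Error: $@\n Code:\n$code\n" if $@;
    return $count;
}

# Precompiled patterns, tested one at a time in a loop.
sub count_with_qr {
    my ($genome, @patterns) = @_;
    my $count = 0;
    for my $re (map { qr/$_/ } @patterns) {
        $count++ if $genome =~ $re;
    }
    return $count;
}

# Confirm the candidates agree before timing them.
my @counts = (
    count_with_eval($genome, @patterns),
    count_with_qr($genome, @patterns),
);
die "Candidates disagree: @counts\n" unless $counts[0] == $counts[1];

# A negative count asks Benchmark to run each candidate for at least
# that many CPU seconds, then prints the comparison chart.
cmpthese(-1, {
    eval_string => sub { count_with_eval($genome, @patterns) },
    qr_loop     => sub { count_with_qr($genome, @patterns) },
});
```

The absolute rates will depend on the machine and the Perl version, so only the relative ordering is worth comparing.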