in reply to Style question: regex versus string builtin function
which gets the match position with a regex is slightly faster (but much uglier, of course):

    my $pos;
    if ( $line =~ $regex ) {
        $pos = length $`;
    }
Update: For better ways of getting the match position, see "How do I retrieve the position of the first occurrence of a match?".
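As a rough illustration of one such alternative (a sketch, not necessarily what the linked node recommends): after a successful match, Perl's built-in `@-` array holds the match start offsets, so `$-[0]` gives the same number as `length $``. On perls older than 5.20 this also avoids the slowdown that perlvar describes for programs that mention `$``. The helper name `match_pos` and the sample line are made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns the offset of
# the first match of $regex in $line, or -1 when there is no match,
# mirroring the return convention of index().
sub match_pos {
    my ($line, $regex) = @_;
    # $-[0] is the start offset of the last successful match.
    return $line =~ $regex ? $-[0] : -1;
}

my $DELIMITER = 'GGAGAGGG';
my $regex     = qr{\Q$DELIMITER};
my $line      = 'ACGT' . $DELIMITER . 'TTTT';   # made-up sample data

# Both approaches should report the same position.
printf "regex: %d, index: %d\n",
    match_pos($line, $regex), index($line, $DELIMITER);
# prints "regex: 4, index: 4"
```

Returning -1 on failure keeps the helper interchangeable with `index` in the `$pos >= 0` test used in the benchmark below.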
Benchmark code:
    #!/usr/bin/perl
    use strict;
    use warnings;
    use Benchmark qw(:all) ;

    my $count = 5000;
    my $filename = 'TEST.dat';
    my $DELIMITER = 'GGAGAGGG';
    #my $DELIMITER = 'TTTTCATGAAGAAGATGAGAGACAAGATGAGAAAATAGTATCAGAGA';

    my $regex = qr{\Q$DELIMITER}o;

    cmpthese($count, {
        'index' => sub {
            open my $FH, '<', $filename;
            my $i;
            while (my $line = <$FH>) {
                my $pos = index $line, $DELIMITER;
                if ( $pos >= 0 ) {
                    $i++;
                }
            }
            close $FH;
        },
        'regex_compiled_pos' => sub {
            open my $FH, '<', $filename;
            my $i;
            while (my $line = <$FH>) {
                my $pos;
                if ( $line =~ $regex ) {
                    $i++;
                    $pos = length $`;
                }
            }
            close $FH;
        },
        'regex_compiled' => sub {
            open my $FH, '<', $filename;
            my $i;
            while (my $line = <$FH>) {
                if ( $line =~ $regex ) {
                    $i++;
                }
            }
            close $FH;
        },
        'regex_pos' => sub {
            open my $FH, '<', $filename;
            my $i;
            while (my $line = <$FH>) {
                my $pos;
                if ( $line =~ /\Q$DELIMITER/ ) {
                    $i++;
                    $pos = length $`;
                }
            }
            close $FH;
        },
        'regex' => sub {
            open my $FH, '<', $filename;
            my $i;
            while (my $line = <$FH>) {
                if ( $line =~ /\Q$DELIMITER/ ) {
                    $i++;
                }
            }
            close $FH;
        },
    });
                        Rate index regex_pos regex regex_compiled_pos regex_compiled
    index              450/s    --      -38%  -39%               -40%           -41%
    regex_pos          728/s   62%        --   -2%                -3%            -5%
    regex              741/s   65%        2%    --                -1%            -3%
    regex_compiled_pos 749/s   66%        3%    1%                 --            -2%
    regex_compiled     763/s   70%        5%    3%                 2%             --
Replies are listed 'Best First'.

Re^2: Style question: regex versus string builtin function
by oha (Friar) on Oct 02, 2007 at 12:27 UTC
by tye (Sage) on Oct 02, 2007 at 13:53 UTC
by lima1 (Curate) on Oct 02, 2007 at 13:17 UTC
by ikegami (Patriarch) on Oct 02, 2007 at 14:15 UTC
by lima1 (Curate) on Oct 03, 2007 at 09:17 UTC
by eyepopslikeamosquito (Archbishop) on Oct 02, 2007 at 13:38 UTC