One possible weakness with my argument above: These perhaps
aren't the best demonstration cases. Japhy's cases are much
more indepth.
When you benchmark the above 3* example, it doesn't always
work out more optimal. (See below)
With sexegers, I've found there isn't always a 100% clear cut
case where it's faster than regexes, simply because
there are a lot of variables such as:
- If you just want a match that doesn't return any results,
for instance to indicate with boolean value if the match is
true;
- whether or not you take into account the assignment of
the match to a variable - which in the case of sexeger means
you have to reverse it back to it's original form at some
point;
- if you have to do real-time reversal to the string your
searching _on_ (implicit in my example is the assumption this
is done ahead of time, hence mine isn't the best example) -
Japhy refers to this as caching;
- if you do the reversal of the match ouput to it's regular
form during or after the match stage - buffering until later
nets huge gains, where this is possible.
The overhead sometimes nullifies or makes the gains of
sexegers negative. So it really does take some benchmarking
if there isn't a very obvious difference such as with the
case with the 4* examples above (quite obvious upfront.)
None of those example sexegers will run faster than 4p1s0
no matter what you do to them.
# Pre-reversed input strings, boolean match:
Benchmark: timing 100000 iterations of regexes_3p0s1, sexeger_3p1s0...
regexes_3p0s1: 9 wallclock secs ( 7.05 usr + 0.00 sys = 7.05 CPU) @
+ 14174.97/s (n=100000)
sexeger_3p1s0: 6 wallclock secs ( 6.16 usr + 0.00 sys = 6.16 CPU) @
+ 16243.65/s (n=100000)
Rate regexes_3p0s1 sexeger_3p1s0
regexes_3p0s1 14175/s -- -13%
sexeger_3p1s0 16244/s 15% --
# OTF-reversed input strings, non-reversing output:
Benchmark: timing 100000 iterations of regexes_3p0s1, sexeger_3p1s0...
regexes_3p0s1: 8 wallclock secs ( 7.05 usr + 0.00 sys = 7.05 CPU) @
+ 14174.97/s (n=100000)
sexeger_3p1s0: 10 wallclock secs ( 7.90 usr + 0.00 sys = 7.90 CPU) @
+ 12660.73/s (n=100000)
Rate sexeger_3p1s0 regexes_3p0s1
sexeger_3p1s0 12661/s -- -11%
regexes_3p0s1 14175/s 12% --
# Pre-reversed input strings, non-reversing ouput:
Benchmark: timing 100000 iterations of regexes_3p0s1, sexeger_3p1s0...
regexes_3p0s1: 16 wallclock secs (14.16 usr + 0.02 sys = 14.19 CPU) @
+ 7048.46/s (n=100000)
sexeger_3p1s0: 15 wallclock secs (12.84 usr + 0.00 sys = 12.84 CPU) @
+ 7785.89/s (n=100000)
Rate regexes_3p0s1 sexeger_3p1s0
regexes_3p0s1 7048/s -- -9%
sexeger_3p1s0 7786/s 10% --
# Pre-reversed input strings, reversing ouput
Benchmark: timing 100000 iterations of regexes_3p0s1, sexeger_3p1s0...
regexes_3p0s1: 18 wallclock secs (14.26 usr + 0.00 sys = 14.26 CPU) @
+ 7013.70/s (n=100000)
sexeger_3p1s0: 23 wallclock secs (15.14 usr + 0.00 sys = 15.14 CPU) @
+ 6604.75/s (n=100000)
Rate sexeger_3p1s0 regexes_3p0s1
sexeger_3p1s0 6605/s -- -6%
regexes_3p0s1 7014/s 6% --
# Pre-reversed input strings, non-reversing backreferenced output
Benchmark: timing 100000 iterations of regexes_3p0s1, sexeger_3p1s0...
regexes_3p0s1: 20 wallclock secs (15.84 usr + 0.00 sys = 15.84 CPU) @
+ 6311.64/s (n=100000)
sexeger_3p1s0: 17 wallclock secs (14.61 usr + 0.00 sys = 14.61 CPU) @
+ 6844.92/s (n=100000)
Rate regexes_3p0s1 sexeger_3p1s0
regexes_3p0s1 6312/s -- -8%
sexeger_3p1s0 6845/s 8% --
# Pre-reversed input strings, reversing backreferenced output:
Benchmark: timing 100000 iterations of regexes_3p0s1, sexeger_3p1s0...
regexes_3p0s1: 18 wallclock secs (15.79 usr + 0.00 sys = 15.79 CPU) @
+ 6333.50/s (n=100000)
sexeger_3p1s0: 20 wallclock secs (16.42 usr + 0.00 sys = 16.42 CPU) @
+ 6089.44/s (n=100000)
Rate sexeger_3p1s0 regexes_3p0s1
sexeger_3p1s0 6089/s -- -4%
regexes_3p0s1 6333/s 4% --
#!/usr/bin/perl
use Benchmark qw(timethese cmpthese );
my %step3list = (
icate => 'ic',
ative => '',
alize => 'al',
iciti => 'ic',
ical => 'ic',
ful => '',
ness => '',
foo => 'bar'
);
my @step3list_reg = keys %step3list;
my @step3list_sex = map $_=reverse, (keys %step3list);
my @step4list =qw(
al able ance ant ate
ence ent ement er
ible ic ism iti ive ize
ment
ou ous
boo
);
my @step4list_sex = map $_=reverse, @step4list;
sub COUNT (){ 100_000 }
#sub COUNT (){ 1 }
my $ta = timethese (COUNT, {
regexes_4p0s1 => sub {
my $match;
foreach (@step4list){
#print "$_: ";
#if (($match) = /^
if (/^
( a (?: ble | n(?:ce|t) | te | l )
| e (?: ment | n(?:ce|t) | r )
| i (?: ble | sm | ti | ve | ze | c )
| ment
| ous?
)
$/x) {
#print "matches expected set using regexes. $match:$`|$&|$+|$
+'\n";
} else {
#print "doesn't match expected set using regexes.\n";
}
}
},
sexeger_4p0s1 => sub {
my $match;
foreach (@step4list_sex){
#print "$_: ";
#if (($match) = /^
if (/^
#4p0s1r: # No performance gains matching in reverse
( (?: elb | (?:ec|t)n | et | l ) a
| (?: tnem | (?:ec|t)n | r ) e
| (?: elb | ms | it | ev | ez | c ) i
| tnem
| s?uo
)
$/x) {
#$match = reverse $match;
#print "matches expected set using sexeger. $match:$`|$&|$+|$
+'\n";
} else {
#print "doesn't match expected set using sexeger.\n";
}
}
},
sexeger_4p1s1 => sub {
my $match;
foreach (@step4list_sex){
#print "$_: ";
#if (($match) = /^
if (/^
#4p1s1r: # However, this has significant performance gains
( (?: e(?:lb|cn|t) | tn | l ) a
| (?: tnem | (?:ec|t)n | r ) e
| (?: e(?:lb|[vz]) | it | ms | c ) i
| tnem
| s?uo
)
$/x) {
#$match = reverse $match;
#print "matches expected set using sexeger. $match:$`|$&|$+|$
+'\n";
} else {
#print "doesn't match expected set using sexeger.\n";
}
}
},
});
cmpthese($ta);
my $tb = timethese (COUNT, {
regexes_3p0s1 => sub {
my $match;
foreach (@step3list_reg){
#print "$_: ";
#if (($match) = /^
if (/^
#3p0s1: # Less optimal
#(
(?:
(?: icat | ativ | aliz ) e
| iciti
| (?: ica | fu ) l
| ness
)
$/x) {
#$match = $1;
#print "matches expected set using regexes. $match:$`|$&|$+|$
+'\n";
} else {
#print "doesn't match expected set using regexes.\n";
}
}
},
sexeger_3p1s0 => sub {
my $match;
#foreach (@step3list_reg){
foreach (@step3list_sex){
#print "$_: ";
#if (reverse=~/^
#if (($match) = /^
if (/^
#3p1s0r: # More optimal - performance gains in reverse
#(
(?:
e (?: taci | vita | zila )
| itici
| l (?: aci | uf )
| ssen
)
$/x) {
#$match = $1;
#$match = reverse $1;
#$match = reverse $match;
#print "matches expected set using sexeger. $match:$`|$&|$+|$
+'\n";
} else {
#print "doesn't match expected set using sexeger.\n";
}
}
}
});
cmpthese($tb);
__END__
=head1 NAME
sextest.pl - Test Sexeger performance
=head1 DESCRIPTION
An example script to test various sexeger performance
attributes.
=head1 AUTHOR
"darksym" :)
=head1 Benchmark Example Ouput
Benchmark: timing 100000 iterations of regexes_4p0s1, sexeger_4p0s1, s
+exeger_4p1
s1...
regexes_4p0s1: 23 wallclock secs (18.66 usr + 0.01 sys = 18.66 CPU) @
+ 5357.89/s
(n=100000)
sexeger_4p0s1: 33 wallclock secs (26.89 usr + 0.03 sys = 26.92 CPU) @
+ 3714.45/s
(n=100000)
sexeger_4p1s1: 29 wallclock secs (24.48 usr + 0.02 sys = 24.49 CPU) @
+ 4082.93/s
(n=100000)
Rate sexeger_4p0s1 sexeger_4p1s1 regexes_4p0s1
sexeger_4p0s1 3714/s -- -9% -31%
sexeger_4p1s1 4083/s 10% -- -24%
regexes_4p0s1 5358/s 44% 31% --
Benchmark: timing 100000 iterations of regexes_3p0s1, sexeger_3p1s0...
regexes_3p0s1: 8 wallclock secs ( 7.10 usr + 0.00 sys = 7.10 CPU) @
+ 14081.41/
s (n=100000)
sexeger_3p1s0: 6 wallclock secs ( 6.26 usr + -0.01 sys = 6.25 CPU) @
+ 16000.00/
s (n=100000)
Rate regexes_3p0s1 sexeger_3p1s0
regexes_3p0s1 14081/s -- -12%
sexeger_3p1s0 16000/s 14% --
=cut