
Can you tell me why physi's code is slower than the rest? He suggested this:

$data =~ s/(^\s*|\s*$)//g;

I added these two subs to your benchmark:

sub both_at_once {
    my $data = $testdata;
    $data =~ s/(^\s+|\s+$)//g;
    return $data;
}

sub both_at_once2 {
    my $data = $testdata;
    $data =~ s/(^\s*|\s*$)//g;
    return $data;
}

And this was the result:

Benchmark: timing 100000 iterations of both_at_once, both_at_once2, dotstar, first_n_last_1, first_n_last_2...
both_at_once:   10 wallclock secs ( 9.04 usr + 0.00 sys =  9.04 CPU) @ 11061.95/s (n=100000)
both_at_once2:  11 wallclock secs (10.40 usr + 0.00 sys = 10.40 CPU) @  9615.38/s (n=100000)
dotstar:         9 wallclock secs ( 8.30 usr + 0.00 sys =  8.30 CPU) @ 12048.19/s (n=100000)
first_n_last_1:  6 wallclock secs ( 5.77 usr + 0.00 sys =  5.77 CPU) @ 17331.02/s (n=100000)
first_n_last_2:  2 wallclock secs ( 2.31 usr + 0.00 sys =  2.31 CPU) @ 43290.04/s (n=100000)

Unless I'm mistaken, the alternation (^\s+|\s+$) makes the engine try both branches at every character position. But doesn't the engine know to attempt ^\s+ only at the beginning of the string, and likewise \s+$ only at the end? I'm just curious why this is so slow.

(Ovid) Re(3): Dot star okay, or not?
by Ovid (Cardinal) on Jul 05, 2001 at 22:16 UTC

    If you really want to get a good handle on how regular expressions work, try reading "Mastering Regular Expressions" by Jeffrey Friedl. Further, you can try the re pragma to see the regex engine at work:

    use strict;
    use re 'debug';

    my $string = 'abcdC';
    print "Matched: $1\n" if $string =~ /((?<!b)[cC])/;

    Try various strings and regexes and you'll begin to understand that output. The nice thing is that this will also show you some of the optimizations that the regex engine performs.
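
    The same pragma can be pointed at the alternation from the question. This is only a minimal sketch: the sample string is made up for illustration, and the debug trace (written to STDERR) differs between perl versions, so no particular output is shown here.

```perl
use strict;
use warnings;

# Hypothetical sample input, padded with two spaces on each side.
my $data = "  hello world  ";

{
    # In modern perls the pragma is lexically scoped, so only this
    # block's regex is traced. The trace goes to STDERR.
    use re 'debug';
    $data =~ s/(^\s+|\s+$)//g;
}

print "Trimmed: [$data]\n";
```

    Reading the trace for a short string like this one is a reasonable way to see where the engine attempts each branch of the alternation.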

    Cheers,
    Ovid


Re: Re(2): Dot star okay, or not?
by japhy (Canon) on Jul 05, 2001 at 22:00 UTC
    To answer you, no, Perl doesn't optimize your regex to look only at the beginning and end of the string. Sorry.
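
    For comparison, here is a rough sketch that times the alternation against two separately anchored substitutions. The sub name two_step and the sample string are assumptions made for illustration; the first_n_last subs from the benchmark above are not shown in this thread, so this is not necessarily what they contain. Timings will vary with perl version and input.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sample input; the thread's actual $testdata is not shown.
my $testdata = "   some text with padding on both sides   ";

# The alternation from the question: one s///g with both branches.
sub both_at_once {
    my $data = $testdata;
    $data =~ s/(^\s+|\s+$)//g;
    return $data;
}

# Assumed alternative: two substitutions, each with a single anchor.
sub two_step {
    my $data = $testdata;
    $data =~ s/^\s+//;
    $data =~ s/\s+$//;
    return $data;
}

# Both approaches should produce the same trimmed string.
print "same result\n" if both_at_once() eq two_step();

# Run each sub for at least one CPU second and print a comparison table.
cmpthese(-1, {
    both_at_once => \&both_at_once,
    two_step     => \&two_step,
});
```

    Splitting the work this way gives each regex exactly one anchor, so neither has to carry an alternation.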

    japhy -- Perl and Regex Hacker