Re^2: Parser Performance Question

I agree with LanX: we need a short, self-contained example. For example, I tried the following, but it doesn't reproduce the problem:

$n = 'x' x 50 . "\n";
$p = "=foo $n";
$np = ($n x 50) . $p;
$_= $np x 100_000;

1 while m/\G ( = [a-zA-Z] .* ) \n/xgc;

In fact, for me 5.20 is 3 times faster than 5.18 with that example. Since for 5.20.0 I heavily reworked the part of the regex engine that emits the debugging messages you show, I'd be very interested in real working examples where my changes made things slower rather than faster.

Dave.


Re^3: Parser Performance Question
by songmaster (Beadle) on Oct 19, 2017 at 19:23 UTC

Sorry for the delay in responding further to this, and thanks to everyone for their input. The fix I have committed for now moves the .* match out of the = [a-zA-Z] regex and into a separate one. This works okay, but I would prefer something slightly less ugly.
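For readers who want to see the shape of such a split, here is a minimal sketch. It is not the committed fix, only one way the single pattern might be broken into a short "head" match followed by a separate .* match. The input is the same synthetic data as the stand-alone test case in this post; the names $head and $start are hypothetical and exist only for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Synthetic input: 500 blocks of 50 "xxx" lines, each followed by
# one "=foo bar" line, plus a trailing block of 50 "xxx" lines.
my $l = 'x' x 50 . "\n";
my $x = $l x 50;
my $p = "=foo bar\n";
$_ = ($x . $p) x 500 . $x;

my $pod = '';
my $nx  = 0;

while (1) {
    my $start = pos() // 0;    # remember where this iteration began

    if (m/\G ( = [a-zA-Z] )/xgc) {
        # First regex: only the short "=X" head of the line.
        my $head = $1;

        if (m/\G ( .* ) \n/xgc) {
            # Second regex: the rest of the line up to the newline.
            $pod .= $head . $1;
        }
        else {
            # No newline follows: behave as if the combined pattern
            # had failed, restoring the position before stopping.
            pos() = $start;
            last;
        }
    }
    elsif (m/\G x+ \n/xgc) {
        # match xxx lines
        $nx++;
    }
    else {
        last;
    }
}

print "pod lines: ", ($pod =~ tr/=//), ", x lines: $nx\n";
```

The captured text is the same as with the combined pattern, because the head and the tail are concatenated before being appended to $pod. Whether this form avoids the slowdown on a given perl would need to be measured.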

Here is some stand-alone code that demonstrates the regression, although it doesn't show quite as dramatic a slowdown as my original:

#!/usr/bin/env perl

$l = 'x' x 50 . "\n";
$x = $l x 50;
$p = "=foo bar\n";
$_= ($x . $p) x 500 . $x;

$nx = 0;

while (1) {
    if (m/\G ( = [a-zA-Z] .* ) \n/xgc) {
        $pod .= $1;
    }
    elsif (m/\G x+ \n/xgc) {
        # match xxx lines
        $nx++;
    }
    else {
        last;
    }
}

My results show this taking 3-4 times as long under 5.20.0 as under 5.18.0:

woz$ perlbrew use 5.18.0
woz$ time perl re.pl 

real    0m0.035s
user    0m0.026s
sys    0m0.004s
woz$ perlbrew use 5.20.0
woz$ time perl re.pl 

real    0m0.128s
user    0m0.120s
sys    0m0.005s
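At run times this short, interpreter start-up is a noticeable share of what the shell's time reports. As a rough sketch (an assumption, not how the figures above were obtained), the loop could be timed in-process with the core Time::HiRes module, repeated a few times, and the same script run under each perl. The names scan and $runs are hypothetical.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Time::HiRes ();

# Same synthetic input as the stand-alone test case.
my $l    = 'x' x 50 . "\n";
my $x    = $l x 50;
my $p    = "=foo bar\n";
my $text = ($x . $p) x 500 . $x;

# One full pass over $text using the original combined pattern.
sub scan {
    my ($pod, $nx) = ('', 0);
    pos($text) = 0;    # restart matching from the beginning
    while (1) {
        if ($text =~ m/\G ( = [a-zA-Z] .* ) \n/xgc) {
            $pod .= $1;
        }
        elsif ($text =~ m/\G x+ \n/xgc) {
            $nx++;
        }
        else {
            last;
        }
    }
    return ($pod, $nx);
}

my $runs = 10;         # arbitrary repeat count for illustration
my ($pod, $nx);

my $t0 = Time::HiRes::time();
($pod, $nx) = scan() for 1 .. $runs;
my $elapsed = Time::HiRes::time() - $t0;

printf "perl %vd: %d runs, %.4f s per run, %d x lines per run\n",
    $^V, $runs, $elapsed / $runs, $nx;
```

Running it once after each perlbrew use gives per-run figures that exclude start-up cost, which should make the ratio between versions easier to compare.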
- Andrew