PerlMonks  

Re^2: regex gotcha moving from 5.8.8 to 5.30.0?

by mordibity (Acolyte)
on Feb 09, 2021 at 21:17 UTC


in reply to Re: regex gotcha moving from 5.8.8 to 5.30.0?
in thread regex gotcha moving from 5.8.8 to 5.30.0?

Hmm, that's interesting, there is some data-dependency! My first attempt to make some fake data, like yours, didn't lead to any performance difference between 5.8.8 and 5.30.0. So I made the data-faker a little smarter (in particular, multi-line begfoo declarations) and was able to get a delta to show up:

my $num = shift or die "num?\n";
for my $i (0 .. $num) {
    my @in  = map { "input$_" }  (0 .. int(rand(100)));
    my @out = map { "output$_" } (0 .. int(rand(100)));
    print "begfoo FOO_$i (\n", join(",\n", @in, @out), ");\n";
    print " input $_;\n"  foreach @in;
    print " output $_;\n" foreach @out;
    print " foo inst$_ (j, k, l, m, n, o, p);\n" foreach 0 .. int(rand(100));
    print "endfoo\n\n";
}

I generated some dummy data with 50000 definitions ("make_out.pl 50000 > 50k.foo"), giving a file of about 263 MB. That was large and realistic enough to show a definite difference of more than 2x in wall-clock time:

  • 5.8.8 : 0.01s user 0.02s system 0% cpu 14.474 total
  • 5.30.0 : 0.01s user 0.02s system 0% cpu 37.312 total
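For a like-for-like comparison, the same script can be run under each perl binary against the same generated file. Below is a minimal sketch, assuming the file is slurped and matched with a global regex. The pattern in `$re` is only a hypothetical stand-in for the real one from the root post, and `time_matches` is a made-up helper name.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical stand-in pattern: one begfoo ... endfoo definition.
# Substitute the real regex from the thread.
my $re = qr/^begfoo\s+(\w+)\s*\((.*?)\);(.*?)^endfoo$/ms;

# Count matches of $re in $text and return (count, elapsed wall-clock seconds).
sub time_matches {
    my ($text, $re) = @_;
    my $start = time();
    my $count = 0;
    $count++ while $text =~ /$re/g;
    return ($count, time() - $start);
}

# Only touch the filesystem when a file name is given on the command line.
if (@ARGV) {
    my $file = shift @ARGV;
    open my $fh, '<', $file or die "Cannot open $file: $!\n";
    my $text = do { local $/; <$fh> };   # slurp the whole file
    close $fh;
    my ($count, $elapsed) = time_matches($text, $re);
    printf "perl %vd: %d definitions in %.3fs\n", $^V, $count, $elapsed;
}
```

Running it once per interpreter on 50k.foo should give numbers that exclude interpreter startup and file reading, so any gap that remains is attributable to the matching loop itself.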

Re^3: regex gotcha moving from 5.8.8 to 5.30.0?
by choroba (Cardinal) on Feb 09, 2021 at 22:44 UTC
    I usually use re 'debug'; when debugging regular expressions, but I'm not sure it's helpful in this case.

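    The usual suspects are .* or .*?: the greedy .* starts by matching as much of the string as it can and then backtracks to match less, while the lazy .*? grows one character at a time and retries the rest of the pattern at every step. Can't you replace them with [^;]* or similar?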
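As an illustration of that substitution, here is a small sketch on a made-up statement. The sample text and both patterns are assumptions for illustration only, not the regex from the original question. When the terminator is a single character, both forms capture the same text, but the negated class can never run past a `;`, so the engine has fewer positions to retry.

```perl
use strict;
use warnings;

# Made-up sample resembling the generated data (not the real input).
my $line = " foo inst7 (j, k, l, m, n, o, p);\n output out3;\n";

# Lazy dot: grows one character at a time, testing for ';' after each step.
my ($lazy) = $line =~ /foo\s+(.*?);/s;

# Negated class: consumes everything up to the next ';' in one run.
my ($negated) = $line =~ /foo\s+([^;]*);/;

print "lazy:    '$lazy'\n";
print "negated: '$negated'\n";
print "same capture\n" if $lazy eq $negated;
```

Adding `use re 'debug';` at the top of such a script prints the compiled regex program and the match trace to STDERR, which makes the difference in the number of steps visible on small inputs.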


      Thx for the idea; I just tried it (using [^;]*? instead of .*? in the two locations) but it didn't really change the (crude) runtimes -- 5.8.8 is still over twice as fast as 5.30.0 (~14 sec vs ~37 sec) on the fake data, and more than 10x as fast (~6 sec vs ~105 sec) on the real data...
