in reply to Is regcomp slower in 5.16.2 than in 5.8.8. How to speed things up ?

5.10.0 introduced the TRIE mechanism which allows alternations ('|') to be matched more efficiently (especially with many alternatives), at the cost of greater set-up time at the start of the alternation.

The examples you've shown have tended to be simple alternations that emphasise start-up time rather than fail-and-try-another-alternative time, which is why most (but not all) are faster in 5.8.x.

Also, micro-benchmarks like these tend to be very sensitive to particular optimisations: change the pattern slightly, and you get very different results. Sometimes perl can tell a pattern will fail without even running the alternation, and so on.

Having said that, I do wonder whether the TRIE compilation code should skip creating a trie when the alternation is simple with few branches.

(A quick technical overview for those interested: in something like (to|be|or|not|to|be), 5.8.x would try to match each word in turn, which is slow when there is a big list of words. A TRIE, on the other hand, pre-computes a tree, so it knows the first letter must match one of b, n, o, t, and if the first letter matches t, the second must be o, etc. So the whole alternation is matched in a single pass, a character at a time, rather than going back and trying each word in turn.)
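To make the difference concrete, here is a rough sketch of the two strategies. It is written in Python purely for illustration and is not how perl's regex engine is implemented (that is C code inside the engine); the function names and the nested-dict trie layout are my own assumptions. It only shows the idea: build_trie is the extra set-up work, and match_trie then walks the input once instead of retrying each word.

```python
END = ""  # marker key for "an alternative ends here"; never clashes with a 1-char key


def match_sequential(words, text, pos=0):
    """Try each alternative in turn from the same position (the 5.8.x-style approach)."""
    for word in words:
        if text.startswith(word, pos):
            return word
    return None


def build_trie(words):
    """Pre-compute a nested-dict trie. This is the extra set-up cost."""
    root = {}
    for index, word in enumerate(words):
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        # Keep the earliest index, so a repeated word (the second "to") changes nothing.
        node.setdefault(END, index)
    return root


def match_trie(root, words, text, pos=0):
    """Walk the text once; among the words that match, return the one listed first."""
    node = root
    best = None
    for i in range(pos, len(text)):
        ch = text[i]
        if ch not in node:
            break  # no alternative continues with this character
        node = node[ch]
        if END in node and (best is None or node[END] < best):
            best = node[END]
    return None if best is None else words[best]


words = ["to", "be", "or", "not", "to", "be"]
trie = build_trie(words)
print(sorted(trie))                        # possible first letters
print(match_sequential(words, "not now"))  # tries to, be, or, then not
print(match_trie(trie, words, "not now"))  # follows n, o, t in one pass
```

With six short words the sequential version has very little to do, so the cost of building the trie can easily dominate, which is the effect the benchmarks in this thread are picking up. The trie only pays off as the list of alternatives grows.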

Dave.

Re^2: Is regcomp slower in 5.16.2 than in 5.8.8. How to speed things up ?
by PerlBhikkhuni (Initiate) on May 29, 2013 at 17:03 UTC
    interesting..