in reply to Re^2: Warning about playing with matches
in thread Warning about playing with matches

"And I am guessing that I would not want to use the sorted version I have listed above for performance reasons."
With a trie it doesn't matter. Any (list|of|fixed|words) alternation, no matter how long and whether sorted or not, will be pre-compiled into a trie. The only thing you should do is deduplicate the words. To give you some idea of the performance difference, take the pattern at the end of your posting and do a repeated failed match against a long string:
my $r = qr/(?:u|uau|uauau| .... ufududufubudufufudufubudufu)/x;
my $s = "a" x 1000000;
$s =~ $r for 1..10;

On my laptop, this takes 27s on 5.8.9 and 0.024s on 5.10.0 and later.

perl 5.10.0 was released about 8 years ago. If you're going to do lots of matching against big word lists it would pay to upgrade to something newer than 5.8.x.
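
As a rough sketch of the deduplication step, assuming the words are held in an array (the short list below is made up for illustration and is not the one from your posting), something like this should do:

```perl
use strict;
use warnings;

# Hypothetical word list containing duplicates; a stand-in for the real one.
my @words = qw(u uau uauau uau u ufu);

# Deduplicate while preserving first-seen order.
my %seen;
my @unique = grep { !$seen{$_}++ } @words;

# quotemeta guards against regex metacharacters appearing in the words.
my $alternation = join '|', map { quotemeta } @unique;
my $re = qr/(?:$alternation)/;

print scalar(@unique), " unique words\n";
print "matched\n" if "xxuauxx" =~ $re;
```

The order of @unique is left alone, since with the trie it makes no difference to performance.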

Dave.

Re^4: Warning about playing with matches
by Anonymous Monk on Oct 13, 2015 at 21:40 UTC
      "I remember an upper limit for trie optimization; are you saying it was removed?"
      Interesting, I did not know that. It appears that when the initial regex opnode list is constructed, if it has more than 65535 nodes (so the BRANCH nodes have to use LONGJMP nodes to continue) then the trie optimisation doesn't kick in. This is indeed still the case.
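
One way to check whether the optimisation kicked in for a given pattern is to dump the compiled program with the re pragma's debug mode and look for a TRIE node. A minimal sketch, assuming a Unix-like shell and perl 5.10 or later (the foo|bar|baz pattern is only a placeholder):

```perl
use strict;
use warnings;

# Compile a small alternation in a child perl with regex debugging on.
# The compiled program is dumped to STDERR, so fold it into STDOUT.
my $out = `$^X -Mre=debug -e "qr/(?:foo|bar|baz)/" 2>&1`;

# A trie-optimised alternation shows up as a TRIE-* node in the dump.
print $out =~ /TRIE/ ? "trie optimisation applied\n" : "no trie node found\n";
```

For a pattern over the node limit described above, I would expect the dump to show plain BRANCH nodes instead.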

      Dave.