RE: RE (tilly) 1: Numeric list to optimised regexp

by ncw (Friar)
on Sep 07, 2000 at 12:54 UTC ( [id://31368]=note )


in reply to RE (tilly) 1: Numeric list to optimised regexp
in thread Numeric list to optimised regexp

I didn't realise anyone had had a go at this sort of thing already, though it is inevitable really!

I'll answer tilly's question with an example:

My code gives (for the list 1..255)

[1-9]|(?:[1-9]|1\d|2[0-4])\d|25[0-5]
Whereas your code gives
((?:1(?:|0(?:|0|1|2|3|4|5|6|7|8|9)|1(?:|0|1|2|3|4|5|6|7|8|9)|2(?:|0| +1|2|3|4|5|6|7|8|9)|3(?:|0|1|2|3|4|5|6|7|8|9)|4(?:|0|1|2|3|4|5|6|7|8|9 +)|5(?:|0|1|2|3|4|5|6|7|8|9)|6(?:|0|1|2|3|4|5|6|7|8|9)|7(?:|0|1|2|3|4| +5|6|7|8|9)|8(?:|0|1|2|3|4|5|6|7|8|9)|9(?:|0|1|2|3|4|5|6|7|8|9))|2(?:| +0(?:|0|1|2|3|4|5|6|7|8|9)|1(?:|0|1|2|3|4|5|6|7|8|9)|2(?:|0|1|2|3|4|5| +6|7|8|9)|3(?:|0|1|2|3|4|5|6|7|8|9)|4(?:|0|1|2|3|4|5|6|7|8|9)|5(?:|0|1 +|2|3|4|5)|6|7|8|9)|3(?:|0|1|2|3|4|5|6|7|8|9)|4(?:|0|1|2|3|4|5|6|7|8|9 +)|5(?:|0|1|2|3|4|5|6|7|8|9)|6(?:|0|1|2|3|4|5|6|7|8|9)|7(?:|0|1|2|3|4| +5|6|7|8|9)|8(?:|0|1|2|3|4|5|6|7|8|9)|9(?:|0|1|2|3|4|5|6|7|8|9)))
My aim was to get rid of as many alternations (which are slow) as possible and turn them into character classes (which are fast). I also wanted to factor the regexp as much as possible.
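For anyone who wants to convince themselves that the short pattern quoted above for 1..255 really covers the list and nothing else, here is a minimal check. It is only a sketch: the anchoring and the 0..999 test range are my own assumptions, not part of the original code.

```perl
use strict;
use warnings;

# The compact pattern from the post, anchored so it must match the whole string.
# (The \A ... \z anchors are added here for the test; the post shows the bare pattern.)
my $compact = qr/\A(?:[1-9]|(?:[1-9]|1\d|2[0-4])\d|25[0-5])\z/;

# Collect every integer in 0..999 that the pattern accepts.
my @matched = grep { "$_" =~ $compact } 0 .. 999;

# Compare against the intended list 1..255.
my $ok = "@matched" eq join(' ', 1 .. 255);
print $ok ? "matches exactly 1..255\n" : "mismatch\n";
```

Note that the alternation only appears at the top level and in one small group; everything else is a character class or \d, which is the shape the optimisation is aiming for.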

If you change my code, replacing every \d with \w or whatever is appropriate, it should work fine for any list of words, but I designed and tested it with numeric lists in mind.

My first attempt at this problem used a trie-like data structure, but I abandoned it once I had the idea of using backtracking regexps - the irony of using regexps to optimise regexps was irresistible!
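To illustrate what a trie-based approach might look like, here is a rough sketch. It is not the abandoned code and not tilly's code; `build_trie` and `trie_to_regex` are hypothetical names made up for this example. It stores each word character by character in nested hashes and then emits one nested alternation per node, which gives output of the same general shape as the long pattern quoted above.

```perl
use strict;
use warnings;

# Build a trie as nested hashes; the key '' marks "a word may end here".
sub build_trie {
    my %trie;
    for my $word (@_) {
        my $node = \%trie;
        $node = ($node->{$_} ||= {}) for split //, $word;
        $node->{''} = 1;
    }
    return \%trie;
}

# Turn a trie node into a regexp fragment made of nested alternations.
sub trie_to_regex {
    my ($node) = @_;
    my @alts;
    for my $ch (sort keys %$node) {
        if ($ch eq '') {
            push @alts, '';    # empty alternative: the word can stop here
            next;
        }
        push @alts, quotemeta($ch) . trie_to_regex($node->{$ch});
    }
    return $alts[0] if @alts == 1;    # single branch needs no grouping
    return '(?:' . join('|', @alts) . ')';
}

# Small example, then the 1..255 list from the post.
my $small = trie_to_regex(build_trie(qw(1 10 11 2)));
print "$small\n";    # prints (?:1(?:|0|1)|2)

my $re_src = trie_to_regex(build_trie(1 .. 255));
my $re     = qr/\A(?:$re_src)\z/;
my @hits   = grep { "$_" =~ $re } 0 .. 999;
print scalar(@hits), " numbers matched\n";
```

The result is correct but full of single-character alternations, which is exactly what the character-class approach tries to avoid.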

Replies are listed 'Best First'.
RE (tilly) 3 (nice): Numeric list to optimised regexp
by tilly (Archbishop) on Sep 07, 2000 at 14:29 UTC
    OK, this is nice. OTOH I really want to see the win moved down to the RE engine, and at least one optimization that was discussed with Ilya would move all of the wins from both of our approaches down.

    So someday you should see all matches speed up because of this kind of logic, without having to do any work for it... :-)
