in reply to pattern matching with large regex
A little more detail on what you're doing with the code would be helpful. For example, are you just testing for existence, or are you extracting pieces of data? Are these regular expressions using metacharacters such as .*[]?, or are they constant strings?
Each of these answers may help us help you optimise your code appropriately. For example, searching for a constant string is generally faster with index than with a regular expression. But if you have thousands of them, and you use a regular expression optimizer of some sort from CPAN, you may be able to get a reasonable state machine for finding your data.
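To make the constant-string case concrete, here is a minimal sketch in core Perl. The helper names first_constant_match and build_literal_regex are made up for illustration, and the combined pattern is just a plain quotemeta'd alternation, not the output of any CPAN optimizer. It also assumes the list of strings is non-empty.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first constant string from @$needles
# that occurs in $text, or undef if none does.
sub first_constant_match {
    my ($text, $needles) = @_;
    for my $needle (@$needles) {
        # index returns -1 when the substring is absent.
        return $needle if index($text, $needle) >= 0;
    }
    return;
}

# Hypothetical helper: build one regex that matches any of the constant
# strings. quotemeta escapes metacharacters so "foo.bar" stays literal.
# Assumes @$needles is not empty.
sub build_literal_regex {
    my ($needles) = @_;
    my $alternation = join '|', map { quotemeta } @$needles;
    return qr/$alternation/;
}

my @needles = ('foo.bar', 'baz');
my $text    = 'xx foo.bar yy';

my $hit = first_constant_match($text, \@needles);
print defined $hit ? "index found: $hit\n" : "index found nothing\n";

my $any = build_literal_regex(\@needles);
print "regex matched: $&\n" if $text =~ $any;
```

Which of the two is faster for your data is something to measure rather than assume; the core Benchmark module is the usual tool for that.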
On the other hand, if you're trying to extract data (which I kind of doubt) and your regular expressions actually use regexp metacharacters, you're probably best off looping through the list:

    my @regexps = load_regexps();
    @regexps = map { qr/$_/ } @regexps; # pre-compile 'em all.
    foreach my $re (@regexps) {
        if ($text =~ $re) {
            # do stuff based on match.
        }
    }

Here we precompile each one, and then try each one after another. The compiled regular expressions should execute a bit faster, most likely because each pattern is compiled once up front instead of every time its string is interpolated into a match. Note that if you only check a single chunk of text, you won't save anything by pre-compiling the regular expressions.
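If you do end up extracting data, the loop might look something like the sketch below. The two patterns and the extract_all helper are invented for illustration (the patterns stand in for whatever load_regexps() would return), so treat this as one possible shape rather than the way to do it.

```perl
use strict;
use warnings;

# Hypothetical pattern list standing in for load_regexps().
my @patterns = ('(\d{4})-(\d{2})-(\d{2})', 'user=(\w+)');

# Compile each pattern once, outside the loop over the input.
my @regexps = map { qr/$_/ } @patterns;

# Return one [pattern index, captures...] entry per regexp that matched.
sub extract_all {
    my ($text) = @_;
    my @found;
    for my $i (0 .. $#regexps) {
        # In list context a successful match returns the captured groups
        # (or the single value 1 if the pattern has no groups), and an
        # empty list on failure.
        my @captures = $text =~ $regexps[$i];
        push @found, [ $i, @captures ] if @captures;
    }
    return @found;
}

for my $line ('login user=alice on 2005-08-13', 'nothing here') {
    for my $hit (extract_all($line)) {
        my ($i, @captures) = @$hit;
        print "pattern $i: @captures\n";
    }
}
```

Because the qr// objects are built once and reused for every line, this is the situation where pre-compiling pays off: many chunks of text against the same list of patterns.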