This does not address the scalability issue, but it will speed things up. If most of the lines are being rejected by the first few REs, it might speed it up a lot:

    LINE: foreach my $line (@SYSLOG) {
        foreach my $r (@REGEXPS) {
            next LINE if $line =~ /$r/;
        }
        push @KEEP, $line;
    }
Beyond that, if you had simpler regular expressions you could try tricks like combining them into a trie to stop the RE engine from doing so much redundant work. Unfortunately that will be hard with the examples that you gave. However, you might take some of your complex REs which are closely related and find ways to combine them, as sketched below.
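For instance, here is a minimal sketch of folding all the patterns into a single alternation that is compiled once, so each line is scanned in one pass instead of once per pattern. It reuses the @REGEXPS/@SYSLOG/@KEEP names from the loop above; everything else is an assumption, not your code:

    # Wrap each pattern in a non-capturing group and join them with '|',
    # then compile the result once.
    my $combined    = join '|', map { "(?:$_)" } @REGEXPS;
    my $combined_re = qr/$combined/;

    # Keep only the lines that match none of the patterns.
    my @KEEP = grep { $_ !~ $combined_re } @SYSLOG;

Whether this actually wins depends on the patterns; a huge alternation can be slower than a handful of cheap matches that usually fail early, so benchmark it against the loop version.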
Incidentally, I see that a lot of the work being done by your REs looks like parsing the syslog format for various specific strings. If that is so, then you could parse each line into a small data structure up front and replace many of your current RE matches with much simpler tests against the parsed fields.
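A rough sketch of that idea, assuming the traditional "Mon DD HH:MM:SS host program[pid]: message" layout (the field names and the sshd test are illustrative assumptions, not taken from your code):

    # Split each syslog line into named fields once, then test the fields
    # with cheap comparisons instead of full-line regex matches.
    for my $line (@SYSLOG) {
        next unless $line =~ /^(\w{3}\s+\d+\s+[\d:]{8})\s+(\S+)\s+([^\s\[:]+)(?:\[(\d+)\])?:\s+(.*)$/;
        my %entry = (
            date    => $1,
            host    => $2,
            program => $3,
            pid     => $4,
            message => $5,
        );
        # e.g. drop everything from sshd without running a single big RE
        next if $entry{program} eq 'sshd';
        push @KEEP, $line;
    }

Once the line is broken into fields, many of the checks reduce to equality tests or short matches against one field, which is far cheaper than rescanning the whole line.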
However, trying to do anything fancy adds overhead of its own and may end up losing. You pretty much have to try it and see.
But in the end, if you want to do lots of arbitrary checks against lots of arbitrary strings, lots of work will need to be done.