in reply to speeding up a regex

To load up keywords from a file you would need to do something like this:

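    my @patterns;
    # Three-argument open with an explicit read mode
    open( my $fh, '<', "input-file.txt" ) or die "Cannot open input file: $!";
    while ( my $line = <$fh> ) {
        chomp( $line );
        # Use \Q$line\E instead if keywords may contain regex metacharacters
        push( @patterns, qr/\b$line\b/ );
    }
    close($fh);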

Your regexes are already fairly speedy because you precompile them with the qr// operator. I use this same method to look for roughly 72,000 keywords in thousands of 5-20k full text documents at NewsCloud, and it can process a single full text article in under a second.
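
For illustration, here is a rough sketch of how a list of precompiled patterns might be checked against one document. The keyword list, the sample text, and the matched_keywords() helper are all made up for the example; this is not the actual NewsCloud code.

```perl
use strict;
use warnings;

# Hypothetical keywords; in practice these would come from your input file.
my @keywords = ( 'perl', 'regex', 'keyword' );

# Precompile once, outside any per-document loop.
# \Q...\E quotes metacharacters so each keyword matches literally.
my @patterns = map { qr/\b\Q$_\E\b/ } @keywords;

# Return the keywords whose precompiled pattern matches the given text.
sub matched_keywords {
    my ($text) = @_;
    my @found;
    foreach my $i ( 0 .. $#patterns ) {
        push @found, $keywords[$i] if $text =~ $patterns[$i];
    }
    return @found;
}

my @hits = matched_keywords('Speeding up a regex in perl');
print join( ', ', @hits ), "\n";    # prints "perl, regex"
```

The point is only that the qr// compilation happens once up front, so each document costs nothing but the matching itself.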

One thing that will make your life easier is to start using foreach loops instead of the C-style for loops you are using now. It is much less typing and much less confusing. Remember, you aren't using C anymore :)
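
To show the difference, here is a small side-by-side sketch. The patterns and text are invented, and the C-style loop is only my guess at the general shape of what you have, not your actual code.

```perl
use strict;
use warnings;

# Hypothetical patterns and text, just for the comparison.
my @patterns = ( qr/\bfoo\b/, qr/\bbar\b/ );
my $text     = 'a foo walks into a bar';

# C-style loop: you manage the index and the bounds check yourself.
my $count_c = 0;
for ( my $i = 0; $i < scalar(@patterns); $i++ ) {
    $count_c++ if $text =~ $patterns[$i];
}

# foreach loop: iterate over the elements directly, no index needed.
my $count_foreach = 0;
foreach my $pattern (@patterns) {
    $count_foreach++ if $text =~ $pattern;
}

print "$count_c $count_foreach\n";    # prints "2 2"
```

Both loops do the same work, but the foreach version has no index variable to get wrong.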

Frank Wiles <frank@wiles.org>
http://www.wiles.org