Re^2: Comparing pattern

@bv
Look at study, especially if you have a lot of patterns you are matching against.
I have added study; after both reads, but I can't see any difference when scanning multiple files using this subroutine. Please advise.

#!/usr/bin/perl -w

use strict;

my $patterns = "/path/to/patterns.txt";
my $arg1 = shift;

open (PAT, '<', $patterns) or die "$patterns: $!\n";
my @patterns = <PAT>;
study;
close(PAT);
chomp @patterns;

my $regex_string = join '|', @patterns;

open( FILE, "<", "$arg1") or die "$arg1: $!\n";
$_ = do { local $/; <FILE> };
study;
close(FILE);

if ( /($regex_string)/is ) {print "\n$arg1\n$1\n";}

Re^3: Comparing pattern by bv (Friar) on Sep 21, 2009 at 15:22 UTC
Did you read the documentation on study? `study` attempts to make matches against a string more efficient, but incurs a one-time penalty for the time spent studying the string. It is most beneficial when you are doing many matches against a single string. You should benchmark to determine if you are getting any benefit from study. The first `study` in your code (line 10) is unnecessary, since you don't have a string in $_ to match against.

You keep saying "subroutine." Is this really in a sub? If so, are you reading in your patterns every time the sub is run? That is a major inefficiency. And once you solve that one, you can look at precompiling your expressions like I originally suggested.

`print pack("A25",pack("V*",map{1919242272+$_}(34481450,-49737472,6228,0,-285028276,6979,-1380265972)))`
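As a rough illustration of the precompiling advice above, the sketch below builds the alternation once with qr// outside the sub, so every call reuses the compiled regex. The sample patterns and the name `scan_text` are assumptions made for the example, not taken from the original script. Note also that on Perl 5.16 and later `study` is documented as doing nothing, so any gain on a modern perl would have to come from changes like this one.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical patterns; the original script reads them from patterns.txt.
my @patterns = ('viagra', 'wire\s+transfer', 'lottery.*winner');

# Compile the alternation once, outside the sub, so each call reuses it.
my $joined = join '|', @patterns;
my $regex  = qr/($joined)/is;

# scan_text is an illustrative name, not part of the original script.
# It returns the matched text, or undef when nothing matches.
sub scan_text {
    my ($text) = @_;
    return $text =~ $regex ? $1 : undef;
}

my $hit = scan_text("Cheap VIAGRA here");
print defined $hit ? "matched: $hit\n" : "no match\n";
```

The patterns are read and compiled a single time however many files File::Find hands to the sub; only the slurped file contents change between calls.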
Re^4: Comparing pattern by mrc (Sexton) on Sep 21, 2009 at 17:54 UTC
Yes, it is a subroutine. I need this script to scan all files for scams or other abuses. I'm using File::Find to search all files under a directory tree, then calling the subroutine for each file. The patterns are read outside the sub. Based on your suggestion, I will precompile this way:

`... my $list_regex = join '\|', @patterns; my $regex_string = qr/$list_regex/is; ... if (/($regex_string)/) {print "\n$arg1\n$1\n";}`

As for study, I noticed a slight slowdown. Maybe it's not efficient in my case.

I still have a big problem. Graff helped me with the file slurp, and the scanner now runs several times faster than my original script, but I don't have experience with $/ or $_. If you check my last example, the global $1 contains the entire text between the first pattern and the second pattern:

`pattern1 some text pattern2`

instead of this match:

`pattern1.*pattern2`

Can you please give me some advice? Thank you!
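One possible way to report the pattern itself rather than the text it matched is sketched below, assuming that is the goal: each pattern is compiled separately and kept next to its source string. The sample patterns and the name `first_matching_pattern` are hypothetical. The sketch also shows that joining with `'\|'` produces an escaped, literal pipe rather than an alternation.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical patterns standing in for the contents of patterns.txt.
my @patterns = ('pattern1.*pattern2', 'free\s+money');

# Keep each pattern's source text next to its compiled form.
my @compiled = map { [ $_, qr/$_/is ] } @patterns;

# first_matching_pattern is an illustrative name, not from the thread.
# It returns the source of the first pattern that matches, or undef.
sub first_matching_pattern {
    my ($text) = @_;
    for my $pair (@compiled) {
        my ($source, $regex) = @$pair;
        return $source if $text =~ $regex;
    }
    return;
}

my $which = first_matching_pattern("pattern1 some text pattern2");
print "matched by: $which\n" if defined $which;

# Joining with '\|' yields the string foo\|bar, which only matches
# the literal text "foo|bar", not "foo" or "bar" on their own.
my $escaped = join '\|', 'foo', 'bar';
print "escaped join does not match 'bar'\n" unless 'bar' =~ /$escaped/;
```

Looping over the patterns is likely slower than a single alternation, so a reasonable compromise may be to test the combined regex first and run the loop only on files that match.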