UPDATE:
Thanks to tye, Corion and diotalevi for algorithm assistance in pruning a list currently being iterated over.

    use Getopt::Std;
    use vars qw'$opt_f $opt_P $VERSION';
    $VERSION = 0.08;

    unless( @ARGV ){
      print<<EOUsage;
    usage: vfgrep [-P] -f PATTERNS [FILE]

      -P  prune patterns on match; if a pattern will only match once
          per file, this can speed up execution by discarding the
          pattern once it has matched.
    EOUsage
      exit;
    }

    getopts('Pf:');
    die("vfgrep requires a pattern file\n") unless defined $opt_f;

    open(PAT, $opt_f) or die("Error opening pattern file '$opt_f': $!\n");
    #Strip any sort of EOL garbage
    #Also pre-compile the regexp for a major performance boost
    my @PAT = map{ y/\r\n//d; qr/$_/ } <PAT>;
    close(PAT);

    if( $opt_P ){
      REC: while(<>){
        #Premature optimization?
        #Print the line just read before bailing out, otherwise it is lost
        unless( @PAT ){ print; last REC }
        foreach my $i (0 .. $#PAT ){
          #Splicing is safe here because we leave the foreach immediately
          splice(@PAT, $i, 1) && next REC if /$PAT[$i]/;
        }
        print;
      }
      print <> unless eof; #Pairs with the last above
    }
    else{
      REC: while(<>){
        foreach my $pattern ( @PAT ){
          next REC if /$pattern/;
        }
        print;
      }
    }
    __END__
    Todo: mmap?
Note: In my test case (what I wrote this for), pruning cut the run time by two-thirds.
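For anyone who wants to try the pruning idea outside of vfgrep, here is a minimal sketch of it working on in-memory lists. The `filter_lines` helper and its arguments are hypothetical names made up for this illustration; they are not part of the script above. It assumes that removing an element from the list being indexed is only safe because the inner loop is abandoned right after the splice.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of vfgrep): return the lines that match
# none of the patterns. With $prune true, a pattern is discarded after
# its first match, so later lines are tested against fewer patterns.
sub filter_lines {
    my ($lines, $patterns, $prune) = @_;
    my @pat = map { qr/$_/ } @$patterns;    # private, pre-compiled copy
    my @kept;
    LINE: for my $line (@$lines) {
        for my $i (0 .. $#pat) {
            next unless $line =~ $pat[$i];
            # We jump out of the inner loop right away, so the index
            # range computed before the splice is never used again.
            splice(@pat, $i, 1) if $prune;
            next LINE;
        }
        push @kept, $line;
    }
    return @kept;
}

my @lines = ("alpha\n", "beta\n", "alpha again\n", "gamma\n");

my @plain  = filter_lines(\@lines, ['alpha'], 0);  # both "alpha" lines dropped
my @pruned = filter_lines(\@lines, ['alpha'], 1);  # only the first one dropped

print "without pruning:\n", @plain;
print "with pruning:\n",    @pruned;
```

Note the trade-off this makes visible: pruning changes the output whenever a pattern could match more than once, so it is only a pure speedup when each pattern matches at most one line.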
--
In Bob We Trust, All Others Bring Data.