Some improvements. Notes:
- %ignore now holds compiled patterns.
- Counts are now in %ignored.
- glob2pat should be faster by processing multiple characters at once, but that shouldn't affect your run time.
- Removed $do_not_print to reduce clutter. A label is used on the while loop instead.
- In the second snippet, Regexp::Assemble is used to factor out the "^" and "$", which are common to all ignore regexps.
- The chomp is probably unnecessary, since split(" ", $_) already discards trailing whitespace, including the newline.
- If you only use $cols[0] of @cols, replace
my( @cols ) = split(" ", $_);
with
my( $col0 ) = /^\s*(\S+)/;
{
    my %patmap = (
        '*' => '.*',
        '?' => '.',
        '[' => '[',
        ']' => ']',
    );
    # Quote the joined keys so "[" and "]" are literal inside the class.
    my ($norm) = map qr/[^\Q$_\E]/, join '', keys %patmap;
    sub glob2pat {
        my $globstr = @_ ? $_[0] : $_;
        $globstr =~ s{($norm+|.)} { $patmap{$1} || "\Q$1" }sge;
        return "^$globstr$";
    }
}
my %ignore = map { $_ => qr/$_/ } map { glob2pat($_) } @list_regexps;
my %ignored;
LINE:
while(<FILE>) {
    chomp;
    my( @cols ) = split(" ", $_);
    ...
    for my $re_name ( keys %ignore ) {
        my $re = $ignore{$re_name};
        if ( $cols[0] =~ $re ) {
            $ignored{$re_name}++;
            next LINE;
        }
    }
    ...
}
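To make the counting concrete, here is a minimal, self-contained sketch of the loop above run against an in-memory list instead of FILE. The globs and input lines are made up for illustration, and the character class in glob2pat is written out by hand rather than built from the hash keys, so treat it as an approximation of the snippet rather than a drop-in replacement.

```perl
use strict;
use warnings;

# Same translation table as in the snippet above.
my %patmap = ( '*' => '.*', '?' => '.', '[' => '[', ']' => ']' );

sub glob2pat {
    my $globstr = @_ ? $_[0] : $_;
    # Runs of ordinary characters are quoted in one go; glob
    # metacharacters are looked up one at a time.
    $globstr =~ s{([^*?\[\]]+|.)}{ $patmap{$1} || "\Q$1" }sge;
    return '^' . $globstr . '$';
}

# Hypothetical globs and input lines, for illustration only.
my @list_regexps = ( 'tmp_*', 'file?.log' );
my @lines        = ( "tmp_a 1", "file1.log 2", "keep 3", "tmp_b 4" );

# Keys are the pattern strings, values the compiled patterns.
my %ignore = map { $_ => qr/$_/ } map { glob2pat($_) } @list_regexps;
my ( %ignored, @kept );

LINE:
for (@lines) {
    my ($col0) = /^\s*(\S+)/;
    for my $re_name ( keys %ignore ) {
        if ( $col0 =~ $ignore{$re_name} ) {
            $ignored{$re_name}++;
            next LINE;    # skip the rest of the processing for this line
        }
    }
    push @kept, $_;
}

print "$_ => $ignored{$_}\n" for sort keys %ignored;
print "kept: @kept\n";
```

Because %ignore is keyed by the generated pattern, %ignored reports counts per pattern string (for example ^tmp_.*$), not per original glob.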
If you can live without %ignored, the loop gets simpler:
use Regexp::Assemble qw( );
{
    my %patmap = (
        '*' => '.*',
        '?' => '.',
        '[' => '[',
        ']' => ']',
    );
    # Quote the joined keys so "[" and "]" are literal inside the class.
    my ($norm) = map qr/[^\Q$_\E]/, join '', keys %patmap;
    sub glob2pat {
        my $globstr = @_ ? $_[0] : $_;
        $globstr =~ s{($norm+|.)} { $patmap{$1} || "\Q$1" }sge;
        return "^$globstr$";
    }
}
my $ignore_re = do {
    my $ra = Regexp::Assemble->new();
    $ra->add(glob2pat($_)) for @list_regexps;
    $ra->re();
};
while(<FILE>) {
    chomp;
    my( @cols ) = split(" ", $_);
    ...
    next if $cols[0] =~ $ignore_re;
    ...
}
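On the note about only needing the first column: the sketch below compares the split version with the single capture on a few made-up lines. The helper names first_by_split and first_by_capture are hypothetical, introduced only for this comparison. None of the lines is chomped, which also illustrates why the chomp can probably go.

```perl
use strict;
use warnings;

# Hypothetical helpers, named only for this comparison.
sub first_by_split {
    my @cols = split " ", $_[0];    # splits every field, keeps the first
    return $cols[0];
}

sub first_by_capture {
    my ($col0) = $_[0] =~ /^\s*(\S+)/;    # captures only the first field
    return $col0;
}

# Made-up input lines, newline still attached (no chomp).
my @samples = ( "alpha beta gamma\n", "  leading space\n", "\n" );

for my $line (@samples) {
    my $by_split   = first_by_split($line);
    my $by_capture = first_by_capture($line);
    printf "split: %-10s capture: %s\n",
        defined $by_split   ? $by_split   : '(undef)',
        defined $by_capture ? $by_capture : '(undef)';
}
```

Both forms yield undef for a blank line, so any check on the first column needs to allow for that either way.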