Just because TIMTOWTDI, I put together a version using the qr// operator. I had guessed it would be more efficient than the /o version, but such is not the case:
use Benchmark;
push @lines, ((' ' x int rand 12) . (int rand 60000) . (' ' x int rand 2)) for 1..10000;
@pattern_list = qw(12345 12346 20034 8787 31337 31338 54320 54321);
my $bad_ports = '\b(?:12345|12346|20034|8787|31337|31338|54320|54321)\b';
timethese( 100, {
'qr' => 'with_qr',
'/o' => 'with_o'
});
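sub with_qr {
    # One pass over @lines per pattern: 8 patterns x 10000 lines.
    foreach $pattern (@pattern_list) {
        my $re = qr/\b${pattern}\b/;
        foreach $line (@lines) {
            $line =~ /$re/;
        }
    }
}
sub with_o {
    my $found = 0;
    # Single pass over @lines; /o compiles the interpolated pattern only once.
    for (@lines) {
        (/\b(?:$bad_ports)\b/o);
    }
}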
with these results:
Benchmark: timing 100 iterations of /o, qr...
/o:  7 wallclock secs ( 6.88 usr +  0.00 sys =  6.88 CPU) @ 14.53/s (n=100)
qr: 20 wallclock secs (18.84 usr +  0.00 sys = 18.84 CPU) @  5.31/s (n=100)
Granted, this is a poor dataset, and mileage will probably vary for a real logfile.
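One caveat about the comparison itself: with_qr walks @lines once per pattern (eight passes), while with_o makes a single pass with one alternation, so part of the gap may come from the loop structure rather than from qr// as such. Below is a rough sketch of a like-for-like version, where both subs use the same alternation and scan each line once. The names count_qr, count_o and $alternation are my own and not from the code above, and I make no claim about which variant comes out ahead.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);

# Same synthetic data as above: a random port number padded with spaces.
my @lines = map {
    (' ' x int rand 12) . (int rand 60000) . (' ' x int rand 2)
} 1 .. 10000;

my @pattern_list = qw(12345 12346 20034 8787 31337 31338 54320 54321);
my $alternation  = join '|', @pattern_list;

# Compile the whole alternation once, so each line is scanned a single time.
my $re = qr/\b(?:$alternation)\b/;

sub count_qr {
    my $found = 0;
    for my $line (@lines) {
        $found++ if $line =~ $re;
    }
    return $found;
}

sub count_o {
    my $found = 0;
    for my $line (@lines) {
        # /o: the interpolated pattern is compiled on first use only.
        $found++ if $line =~ /\b(?:$alternation)\b/o;
    }
    return $found;
}

# Sanity check: both variants must agree on the number of matching lines.
die "match counts differ" unless count_qr() == count_o();

timethese(20, { 'qr' => \&count_qr, '/o' => \&count_o });
```

Passing code references to timethese instead of strings also keeps the lexicals visible under strict. The iteration count of 20 is arbitrary; raise it if Benchmark warns that the run is too short to be reliable.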