in reply to Re^2: Regex: Capturing and optionally replacing
in thread Regex: Capturing and optionally replacing
Based on the variable name, I thought you were only printing the host without the domain.
And when you said "skip it", I thought you meant the line, not the domain.
while (<DATA>) {
    my ($host, $domain, $test) = /^([^,.]+)(,[^.]+|)\.(.+)$/;
    $domain =~ s/,/./g;
    # Drop .net domains; match against $domain, not $_
    $domain = '' if $domain =~ /\.net$/;
    print("Host:$host$domain Test:$test\n");
}
Turns out
$domain =~ s/^.*\.net$//;
takes the same amount of time as
$domain = '' if substr($domain, -4) eq '.net';
but both are slightly slower than
$domain = '' if $domain =~ /\.net$/;
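A minimal sketch of how the three variants could be timed against each other with Benchmark. It assumes the last form is meant to test $domain. The sub names and the sample values in @domains are hypothetical, not taken from my original run, and the sub-call overhead will blur small differences, so treat the percentages as rough.

```perl
use strict;
use warnings;
use Benchmark qw( cmpthese );

# Hypothetical sample values (already converted from commas to dots).
my @domains = ('.my-domain.net', '.my-domain.com');

# Each sub works on a copy and returns the (possibly emptied) domain.
sub clear_subst  { my $d = shift; $d =~ s/^.*\.net$//;                 return $d }
sub clear_substr { my $d = shift; $d = '' if substr($d, -4) eq '.net'; return $d }
sub clear_match  { my $d = shift; $d = '' if $d =~ /\.net$/;           return $d }

# Run each variant for at least 1 CPU second and print a comparison table.
cmpthese(-1, {
    by_subst  => sub { clear_subst($_)  for @domains },
    by_substr => sub { clear_substr($_) for @domains },
    by_match  => sub { clear_match($_)  for @domains },
});
```

All three subs should return the same result for a given input; only the speed is in question.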
Update: 17% faster:
while (<DATA>) {
    chomp;
    my ($host, $test) = split(/\./, $_, 2);
    $host =~ s/,/./g;
    $host =~ s/\..*\.net$//;
    print("Host:$host Test:$test\n");
}
Update: If I change
$host =~ s/\..*\.net$//;
to
$host =~ s/\.my-domain\.net$//;
my version is 6% faster than yours (with $host =~ s/,/./g; added).
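To make the behaviour of the split-based version concrete, here is a small sketch that wraps the loop body in a helper and runs it on three lines from the sample data. The name format_line is hypothetical (it is not in the thread); the body uses the specific my-domain pattern.

```perl
use strict;
use warnings;

# format_line is a hypothetical helper name wrapping the split-based loop body.
sub format_line {
    my ($line) = @_;
    chomp $line;
    # Split on the first period only: host part on the left, test on the right.
    my ($host, $test) = split(/\./, $line, 2);
    $host =~ s/,/./g;                  # commas become dots
    $host =~ s/\.my-domain\.net$//;    # strip the domain only when it is .net
    return "Host:$host Test:$test";
}

print format_line($_), "\n" for (
    'hosta-sel-kr-1,my-domain,net.testa',
    'hostc-sel-kr-1,my-domain,com.testa',
    'hostxyz.testabc',
);
# Host:hosta-sel-kr-1 Test:testa
# Host:hostc-sel-kr-1.my-domain.com Test:testa
# Host:hostxyz Test:testabc
```

The .net domain is dropped, the .com domain is kept in dotted form, and a bare host passes through unchanged.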
Update: Benchmark code
use strict;
use warnings;

use Benchmark qw( cmpthese );

my @data = <DATA>;

sub test_m {
    my @rv;
    foreach (@data) {
        local $_ = $_;  # while (<DATA>)
        my ($host, $test) = /
            (                       # Start first capture
                [\w\-]+             # One or more alphanum or hyphens
                (?:                 # Non-capturing group
                    ,my-domain,com  # Literal string
                )?                  # Make it optional
            )                       # End of first capture
            (?:                     # Non-capturing group
                [\w\-,]+            # One or more alphanum, hyphens or commas
            )?                      # Make it optional
            \.                      # A literal period
            (                       # Start second capture
                [a-z]+              # One or more lowercase chars
            )                       # End second capture
        /x or next;
        $host =~ s/,/./g;
        push(@rv, "Host:$host Test:$test\n");
    }
    @rv;
}

sub test_i {
    my @rv;
    foreach (@data) {
        local $_ = $_;  # while (<DATA>)
        chomp;
        my ($host, $test) = split(/\./, $_, 2);
        $host =~ s/,/./g;
        $host =~ s/\.my-domain\.net$//;
        push(@rv, "Host:$host Test:$test\n");
    }
    @rv;
}

sub test_i2 {
    my @rv;
    foreach (@data) {
        local $_ = $_;  # while (<DATA>)
        my ($host, $test) = /([^,.]+(?:,my-domain,com)?)[^.]*\.(.+)/x
            or next;
        $host =~ s/,/./g;
        push(@rv, "Host:$host Test:$test\n");
    }
    @rv;
}

print("m:\n");
print test_m();
print("--\n");
print("i:\n");
print test_i();
print("--\n");
print("i2:\n");
print test_i2();

cmpthese(-2, {
    m  => \&test_m,
    i  => \&test_i,
    i2 => \&test_i2,
});

__DATA__
hosta-sel-kr-1,my-domain,net.testa
hostb-sel-kr-1,my-domain,net.testb
hostc-sel-kr-1,my-domain,com.testa
hostd-sel-kr-1,my-domain,com.testc
hoste-sel-kr-1,my-domain,net.testxyz
hosta-mel-au-1,my-domain,net.testabc
hosta-mel-au-1,my-domain,net.testdef
hostxyz.testabc
someotherhost.someothertest