Re: Match zero times in regex
by AnomalousMonk (Archbishop) on Dec 13, 2011 at 02:08 UTC
IP address matching is a bit tricky, so it's often helpful to turn to accumulated wisdom. Regexp::Common and Regexp::Common::net can help. (To see the trickiness, change one of the 255s in the example data of the OP to 256 – or even 666 or 66666 – and see the result using the OP regex.)
>perl -wMstrict -le
"use Regexp::Common qw(net);
my $IPv4 = qr{ (?<! \d) $RE{net}{IPv4} (?! \d) }xms;
;;
my @strs = (
'permit ip host 10.11.1.1 192.168.100.0 0.0.0.255',
'permit ip 10.11.1.0 0.0.0.255 192.168.100.0 0.0.0.255',
);
;;
for my $s (@strs) {
print qq{from: '$s'};
while ($s =~ m{ (?<! host) \s+ ($IPv4) \s+ ($IPv4) }xmsg) {
print qq{IP pair: '$1' '$2'};
}
}
"
from: 'permit ip host 10.11.1.1 192.168.100.0 0.0.0.255'
IP pair: '192.168.100.0' '0.0.0.255'
from: 'permit ip 10.11.1.0 0.0.0.255 192.168.100.0 0.0.0.255'
IP pair: '10.11.1.0' '0.0.0.255'
IP pair: '192.168.100.0' '0.0.0.255'
Update: Unfortunately, this solution has a bug. See Re^2: Match zero times in regex for counter-example data demonstrating it.
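To make the trickiness mentioned above concrete, here is a minimal sketch that compares a naive digits-and-dots pattern with a range-checked one. The `$octet` and `$strict` patterns are hand-written stand-ins for `$RE{net}{IPv4}`, used only so the example has no non-core dependencies; they are an assumption, not the module's actual pattern, and the helper names are hypothetical.

```perl
use strict;
use warnings;

# Naive pattern: any run of digits is accepted as an octet.
my $naive = qr/\d+\.\d+\.\d+\.\d+/;

# Hand-written stand-in for $RE{net}{IPv4} (an assumption, not the module's
# own pattern): each octet must be 0..255, and the address must not be
# embedded in a longer run of digits.
my $octet  = qr/(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)/;
my $strict = qr/(?<!\d)(?:$octet\.){3}$octet(?!\d)/;

# Hypothetical helpers: true if the whole string is accepted by the pattern.
sub is_naive_ip  { my ($s) = @_; return $s =~ /\A$naive\z/  ? 1 : 0 }
sub is_strict_ip { my ($s) = @_; return $s =~ /\A$strict\z/ ? 1 : 0 }

for my $candidate ('0.0.0.255', '0.0.0.256', '0.0.0.66666') {
    printf "%-12s naive: %d  strict: %d\n",
        $candidate, is_naive_ip($candidate), is_strict_ip($candidate);
}
```

The naive pattern accepts all three candidates, while the range-checked one accepts only `0.0.0.255`.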
Lookbehind was my first thought, but it doesn't work unless one changes what is currently allowed.
from: 'permit ip host 10.11.1.1 192.168.100.0 0.0.0.255'
IP pair: '192.168.100.0' '0.0.0.255'
from: 'permit ip host 10.11.1.1 192.168.100.0 0.0.0.255'
IP pair: '10.11.1.1' '192.168.100.0'
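One input that produces the second result is a line with more than one whitespace character after host. That the counter-example looked like this is an assumption on my part, since whitespace may have been collapsed in the output shown above. A minimal sketch, using a hand-written `$IPv4` stand-in for `$RE{net}{IPv4}` and a hypothetical `ip_pairs` helper:

```perl
use strict;
use warnings;

# Hand-written stand-in for $RE{net}{IPv4} (an assumption, not the module's pattern).
my $octet = qr/(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)/;
my $IPv4  = qr/(?<!\d)(?:$octet\.){3}$octet(?!\d)/;

# Hypothetical helper: collect every IP pair found by the look-behind regex.
sub ip_pairs {
    my ($s) = @_;
    my @pairs;
    while ($s =~ m{ (?<! host) \s+ ($IPv4) \s+ ($IPv4) }xmsg) {
        push @pairs, "'$1' '$2'";
    }
    return @pairs;
}

# One space after "host": the look-behind rejects the only place \s+ can start.
print "one space:  $_\n"
    for ip_pairs('permit ip host 10.11.1.1 192.168.100.0 0.0.0.255');

# Two spaces after "host": \s+ can start at the second space, which is
# preceded by "ost " rather than "host", so the look-behind no longer fires.
print "two spaces: $_\n"
    for ip_pairs('permit ip host  10.11.1.1 192.168.100.0 0.0.0.255');
```

The look-behind only inspects the four characters immediately before the position where `\s+` begins, so any extra whitespace lets the match slip past it.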
The negative look-behind is what the OP was trying to emulate with the {0}. This is definitely the way to go, using negative look-behind and negative look-aheads in the regex! This is a great solution, IMHO...
Just a query: in your regexes, why did you use the m modifier?
... why did you use the m modifier?
This is in line with the recommendations of TheDamian's Perl Best Practices (PBP) for regexes. The /m regex modifier causes ^ $ regex operators also to match after/before embedded newlines. The invariable use of /m and the /s (dot-matches-all, including newlines) modifiers reduces the number of degrees of freedom enjoyed by these operators. In turn, this reduces potential maintenance headaches (I'll show you my scars sometime) and the general brain-hurt associated with regexes.
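A small sketch of what the two modifiers change, using a made-up two-line string (the variable names are only for illustration):

```perl
use strict;
use warnings;

my $text = "first line\nsecond line\n";

# Without /m, ^ matches only at the very start of the string.
my @without_m = $text =~ /^(\w+)/g;     # ('first')

# With /m, ^ also matches just after each embedded newline.
my @with_m    = $text =~ /^(\w+)/mg;    # ('first', 'second')

# Without /s, . does not match a newline, so .* cannot span two lines.
my $dot_plain = $text =~ /first.*second/  ? 1 : 0;   # 0

# With /s, . matches any character, including newline.
my $dot_s     = $text =~ /first.*second/s ? 1 : 0;   # 1

print "without /m: @without_m\n";
print "with /m:    @with_m\n";
print "dot without /s: $dot_plain, dot with /s: $dot_s\n";
```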
The PBP recommendations in general and those for regexes in particular are controversial. (See especially BrowserUk for vigorous counter-argument; also, I think, JavaFan.) I find many, perhaps most, of the recommendations to have compelling arguments in their favor and I rigorously (dare I say blindly?) use those pertaining to regexes.
Thank you. I hadn't heard of Regexp::Common::net. I'll be looking into it.
Re: Match zero times in regex
by ikegami (Patriarch) on Dec 13, 2011 at 00:30 UTC
It does match zero times. In your code, (?:host){0} is matching "host" zero times starting at position 14.
          1         2         3         4
012345678901234567890123456789012345678901234567
permit ip host 10.11.1.1 192.168.100.0 0.0.0.255
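A quick way to check this is to ask Perl where the match starts. This is only a sketch: the `$ip` pattern is the simple digits-and-dots form, and the variable names are made up for the example.

```perl
use strict;
use warnings;

my $entry = 'permit ip host 10.11.1.1 192.168.100.0 0.0.0.255';
my $ip    = qr/\d+\.\d+\.\d+\.\d+/;

my ($start, $first, $second);
if ($entry =~ /(?:host){0}\s+($ip)\s+($ip)/) {
    # $-[0] is the offset at which the whole match begins.
    ($start, $first, $second) = ($-[0], $1, $2);
    print "match starts at offset $start: '$first' '$second'\n";
}
```

`(?:host){0}` consumes nothing, so the match begins at the space at offset 14 and the host address is captured as the first IP.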
What about
if (
my ($pairs) = $entry =~
/^ \s* permit \s+ ip \s+ (?: host \s+ \S+ \s+ )? (.*)/x
) {
while ( $pairs =~ /(\S+) \s+ (\S+)/xg ) {
my ($ip, $mask) = ($1, $2);
... $ip ... $mask ...
}
}
Update: Fixed problem with solution.
while( $pairs =~ m{(\S+) \s+ (\S+)}xg ) {
my( $ip, $mask ) = ( $1, $2 );
}
See Re: Surprise: scalar(($x, $y) = split)
Thanks, fixed. I hate having to use $1 and $2, but forgot that it's required here.
However, the linked node has no bearing on my mistake. The linked node is about the result of list assignment in scalar context, while my mistake was desiring the scalar context behaviour of m//g while calling it in list context.
PS — Mini-Tutorial: Scalar vs List Assignment Operator is much more comprehensive about the behaviour of assignment based on context.
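For anyone following along, a minimal sketch of the context difference described above, using a made-up `$pairs` string:

```perl
use strict;
use warnings;

my $pairs = '10.11.1.0 0.0.0.255 192.168.100.0 0.0.0.255';

# List context: m//g returns the captures of every match at once, flattened.
my @all = $pairs =~ /(\S+) \s+ (\S+)/xg;
print scalar(@all), " captures in list context: @all\n";

# Scalar context (the while condition): m//g finds one match per call and
# remembers its position, so $1 and $2 hold one pair per iteration.
my @seen;
while ($pairs =~ /(\S+) \s+ (\S+)/xg) {
    push @seen, "$1/$2";
}
print "pairs in scalar context: @seen\n";
```

Putting a list assignment such as `my ($ip, $mask) = ...` inside the `while` condition evaluates the match in list context, so the loop never gets the one-pair-per-iteration behaviour; hence the need for `$1` and `$2`.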
Thanks, I see now why my regex wasn't working the way I wanted it to.
Re: Match zero times in regex
by JavaFan (Canon) on Dec 13, 2011 at 00:21 UTC
use 5.010;  # enables say
$_ = "permit ip 10.11.1.0 0.0.0.255 192.168.100.0 0.0.0.255";
while (/(?:host){0}\s+(\d+\.\d+\.\d+\.\d+)\s+(\d+\.\d+\.\d+\.\d+)/g){
say "$1 $2";
}
__END__
10.11.1.0 0.0.0.255
192.168.100.0 0.0.0.255
I copy-and-pasted your regexp.
I don't know why you'd use (?:host){0} though, it doesn't add anything.
Now try the other example he gave.
Re: Match zero times in regex
by vinian (Beadle) on Dec 13, 2011 at 00:35 UTC
maybe this is what you want
(?:host){0,1} or (?:host)? both match zero or one time
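A small sketch showing that the two quantifiers behave identically on the OP's host line (the `$ip` pattern and variable names are only for illustration):

```perl
use strict;
use warnings;

my $ip    = qr/\d+\.\d+\.\d+\.\d+/;
my $entry = 'permit ip host 10.11.1.1 192.168.100.0 0.0.0.255';

# {0,1} and ? are two spellings of the same "zero or one" quantifier.
my @braces   = $entry =~ /(?:host){0,1}\s+($ip)\s+($ip)/;
my @question = $entry =~ /(?:host)?\s+($ip)\s+($ip)/;

print "with {0,1}: @braces\n";
print "with ?:     @question\n";
```

Both forms capture `10.11.1.1` and `192.168.100.0` here, so making host optional does not by itself keep the host address out of the first pair.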
Life is all about making decisions. Stop or go, shake or bake, plea bargain or go to trial... without the ability to make decisions, nothing would ever get done.
Re: Match zero times in regex
by TJPride (Pilgrim) on Dec 13, 2011 at 09:41 UTC
use strict;
use warnings;
my $mask = '\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}';
while (<DATA>) {
chomp;
if (@_ = m/permit ip(?: host)?\s+($mask)\s+($mask)\s+($mask)(?:\s+($mask))?$/) {
no warnings 'uninitialized';
if ($2 eq '192.168.100.0' && $3 eq '0.0.0.255' ||
$3 eq '192.168.100.0' && $4 eq '0.0.0.255') {
print "@_\n";
}
}
}
__DATA__
permit ip host 10.11.1.1 192.168.100.0 0.0.0.255
permit ip 10.11.1.0 0.0.0.255 192.168.100.0 0.0.0.255