Re: Match zero times in regex

IP address matching is a bit tricky, so it's often helpful to turn to accumulated wisdom. Regexp::Common and Regexp::Common::net can help. (To see the trickiness, change one of the 255s in the example data of the OP to 256 – or even 666 or 66666 – and see the result using the OP regex.)
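To make the trickiness concrete, here is a minimal sketch using only core Perl. The `$naive` pattern is an assumption chosen for illustration (the OP's actual regex is not reproduced here), and `$strict` is a hand-rolled stand-in, not the pattern that Regexp::Common ships.

```perl
use strict;
use warnings;

# Hypothetical naive pattern: any 1-3 digits per octet, no range check.
my $naive = qr{ \d{1,3} (?: \. \d{1,3} ){3} }x;

# Hand-rolled stricter pattern: each octet limited to 0..255, and the
# address may not be glued to neighbouring digits on either side.
my $octet  = qr{ 25[0-5] | 2[0-4]\d | 1\d\d | [1-9]?\d }x;
my $strict = qr{ (?<! \d ) $octet (?: \. $octet ){3} (?! \d ) }x;

for my $addr ('0.0.0.255', '0.0.0.256', '0.0.0.666', '0.0.0.66666') {
    my $n = $addr =~ m{ ($naive)  }x ? "matches '$1'" : 'no match';
    my $s = $addr =~ m{ ($strict) }x ? "matches '$1'" : 'no match';
    print "$addr: naive $n; strict $s\n";
}
```

The naive pattern accepts every one of these inputs (for `0.0.0.66666` it quietly matches the prefix `0.0.0.666`), while the range-checked pattern with digit guards accepts only `0.0.0.255`.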

>perl -wMstrict -le
"use Regexp::Common qw(net);
 my $IPv4 = qr{ (?<! \d) $RE{net}{IPv4} (?! \d) }xms;
 ;;
 my @strs = (
   'permit ip host 10.11.1.1 192.168.100.0 0.0.0.255',
   'permit ip 10.11.1.0  0.0.0.255 192.168.100.0 0.0.0.255',
   );
 ;;
 for my $s (@strs) {
   print qq{from: '$s'};
   while ($s =~ m{ (?<! host) \s+ ($IPv4) \s+ ($IPv4) }xmsg) {
     print qq{IP pair: '$1' '$2'};
     }
   }
"
from: 'permit ip host 10.11.1.1 192.168.100.0 0.0.0.255'
IP pair: '192.168.100.0' '0.0.0.255'
from: 'permit ip 10.11.1.0  0.0.0.255 192.168.100.0 0.0.0.255'
IP pair: '10.11.1.0' '0.0.0.255'
IP pair: '192.168.100.0' '0.0.0.255'

Update: Unfortunately, this solution has a bug. See Re^2: Match zero times in regex for counter-example data demonstrating it.


Re^2: Match zero times in regex by ikegami (Patriarch) on Dec 13, 2011 at 04:26 UTC
Lookbehind was my first thought, but it doesn't work unless one changes what is currently allowed.

from: 'permit ip host 10.11.1.1 192.168.100.0 0.0.0.255'
IP pair: '192.168.100.0' '0.0.0.255'
from: 'permit ip host 10.11.1.1 192.168.100.0 0.0.0.255'
IP pair: '10.11.1.1' '192.168.100.0'
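The two `from:` lines above look identical as reproduced here, so the exact counter-example input cannot be recovered from this text. As an assumption, the sketch below uses two spaces between `host` and the address, which is one input that produces the second result. The `$IPv4` pattern is a hand-rolled stand-in for `$RE{net}{IPv4}` so that the example needs no CPAN module.

```perl
use strict;
use warnings;

# Stand-in for $RE{net}{IPv4} (assumption: octets limited to 0..255).
my $octet = qr{ 25[0-5] | 2[0-4]\d | 1\d\d | [1-9]?\d }x;
my $IPv4  = qr{ (?<! \d ) $octet (?: \. $octet ){3} (?! \d ) }x;

# Assumed counter-example: TWO spaces between 'host' and the address.
my $s = 'permit ip host  10.11.1.1 192.168.100.0 0.0.0.255';

my @pairs;
while ($s =~ m{ (?<! host ) \s+ ($IPv4) \s+ ($IPv4) }xmsg) {
    push @pairs, "$1 $2";    # same matching regex as in the parent post
}
print "IP pair: $_\n" for @pairs;
```

The look-behind `(?<! host )` only inspects the four characters immediately before the point where `\s+` begins. With two spaces, `\s+` can begin at the second space, where the preceding characters are `ost ` and not `host`, so the look-behind no longer blocks the match and the host address is captured.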
Re^3: Match zero times in regex by AnomalousMonk (Archbishop) on Dec 13, 2011 at 05:02 UTC
From the OP: ... if the entry is `permit ip host 10.11.1.1 192.168.100.0 0.0.0.255` I want to pull out `192.168.100.0 0.0.0.255` but if the entry is `permit ip 10.11.1.0 0.0.0.255 192.168.100.0 0.0.0.255` I want to pull out `10.11.1.0 0.0.0.255 192.168.100.0 0.0.0.255`

I don't see how an IP (10.11.1.1 in the example) after 'host' is ever desired to be captured. Am I missing something (wouldn't be the first time)?
Re^4: Match zero times in regex by ikegami (Patriarch) on Dec 17, 2011 at 09:20 UTC
"I don't see how an IP (10.11.1.1 in the example) after 'host' is ever desired to be captured."

Exactly, yet I showed that your code does capture it.
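One possible way around the problem, offered as a sketch and not as something proposed in this thread, is to neutralize every `host <address>` before looking for pairs, so that no look-behind is needed. The helper name `ip_pairs` is hypothetical, and `$IPv4` is again a hand-rolled stand-in for `$RE{net}{IPv4}`.

```perl
use strict;
use warnings;

# Stand-in for $RE{net}{IPv4} (assumption: octets limited to 0..255).
my $octet = qr{ 25[0-5] | 2[0-4]\d | 1\d\d | [1-9]?\d }x;
my $IPv4  = qr{ (?<! \d ) $octet (?: \. $octet ){3} (?! \d ) }x;

# Hypothetical helper: returns a list of [address, mask] pairs.
sub ip_pairs {
    my ($entry) = @_;

    # Replace each 'host <address>' with a bare 'host', whatever amount
    # of whitespace separates them, so that address cannot join a pair.
    (my $rest = $entry) =~ s{ \b host \s+ $IPv4 }{host}xmsg;

    my @pairs;
    while ($rest =~ m{ ($IPv4) \s+ ($IPv4) }xmsg) {
        push @pairs, [ $1, $2 ];
    }
    return @pairs;
}

for my $entry (
    'permit ip host 10.11.1.1 192.168.100.0 0.0.0.255',
    'permit ip host  10.11.1.1 192.168.100.0 0.0.0.255',
    'permit ip 10.11.1.0  0.0.0.255 192.168.100.0 0.0.0.255',
) {
    print "from: '$entry'\n";
    print "IP pair: '$_->[0]' '$_->[1]'\n" for ip_pairs($entry);
}
```

This trades a single regex for two passes. It assumes that an address following `host` is never wanted, which is how the OP's requirement is read in this subthread.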
Re^5: Match zero times in regex by AnomalousMonk (Archbishop) on Dec 17, 2011 at 21:13 UTC
Re^2: Match zero times in regex by ricDeez (Scribe) on Dec 13, 2011 at 03:39 UTC
The negative look-behind is what the OP was trying to emulate with the {0}. This is definitely the way to go, using negative look-behind and negative look-aheads in the regex! This is a great solution, IMHO... Just a query: in your regexes, why did you use the m modifier?
Re^3: Match zero times in regex by AnomalousMonk (Archbishop) on Dec 13, 2011 at 04:47 UTC
"... why did you use the m modifier?"

This is in line with the recommendations of TheDamian's Perl Best Practices (PBP) for regexes. The /m regex modifier causes the `^` and `$` regex operators also to match after and before embedded newlines, respectively. The invariable use of the /m and /s (dot-matches-all, including newlines) modifiers reduces the number of degrees of freedom enjoyed by these operators. In turn, this reduces potential maintenance headaches (I'll show you my scars sometime) and the general brain-hurt associated with regexes.

The PBP recommendations in general and those for regexes in particular are controversial. (See especially BrowserUk for vigorous counter-argument; also, I think, the JavaFan.) I find many, perhaps most, of the recommendations to have compelling arguments in their favor and I rigorously (dare I say blindly?) use those pertaining to regexes.
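A small core-Perl sketch of what the two modifiers change. The sample strings are made up for illustration; the `\A` and `\z` line reflects the companion advice to use those anchors when the true start and end of the string are meant, as best I recall it from PBP.

```perl
use strict;
use warnings;

my $text = "alpha one\nbeta two\n";

# Without /m, ^ anchors only at the very start of the string.
my @first_only = $text =~ m{ ^ (\w+) }xg;

# With /m, ^ also matches just after each embedded newline.
my @each_line = $text =~ m{ ^ (\w+) }xmg;

# Without /s, dot will not match a newline; with /s it matches any character.
my $dot_plain = "a\nb" =~ m{ a . b }x  ? 'match' : 'no match';
my $dot_s     = "a\nb" =~ m{ a . b }xs ? 'match' : 'no match';

# With /m always on, \A and \z still mean the true start and end of the string.
my $whole = $text =~ m{ \A alpha .* two \n \z }xms ? 'match' : 'no match';

print "without /m: @first_only\n";
print "with /m: @each_line\n";
print "dot without /s: $dot_plain\n";
print "dot with /s: $dot_s\n";
print "\\A ... \\z under /xms: $whole\n";
```

With the modifiers always on, `^`, `$` and `.` each have one fixed meaning in every regex, which is the reduction in "degrees of freedom" described above.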
Re^4: Match zero times in regex by ricDeez (Scribe) on Dec 13, 2011 at 07:38 UTC
Many thanks for your insights on this... I do have the PBP book and dust it off from time to time to make sure I am not straying from the path. I will make sure to look this recommendation up...
Re^2: Match zero times in regex by SomeNetworkGuy (Sexton) on Dec 13, 2011 at 03:36 UTC
Thank you. I hadn't heard of Regexp::Common::net. I'll be looking into it.