Eimi,
That is good info and really helps. The only remaining problem is that the following regex still succeeds in matching 'A' in the example data I provided:
\G
^(?:int)\s+(\w+)\s*?\n
(?:\s*Number\sof\sFlaps:\s(\d+)\s*?\n)?
(?:\s*IP\sAddress:?\s([\d\.]+)\s*?\n)?
(?=^int|\Z)
Is there a way I am missing to make the whole regex fail in that case?
You have a couple of options:
- You can match against the whole expression first, and do the iterative (/g) match only if the whole expression matches,
- You can accumulate the captures from the iterative match (on edit: using the /c option), and then test against /\G\Z/g before processing your way through them,
- You can lookahead the whole rest of the expression (I don't recommend this, because it duplicates a lot of effort compared to method 1).
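# Option 1:
my $item_regex = qr/
(?:int)\s+(\w+)\s*?\n
(?:\s*Number\ of\ Flaps:\s(\d+)\s*?\n)?
(?:\s*IP\sAddress:?\s([\d\.]+)\s*?\n)?
/x;
# Slurp everything: one item spans several lines, so reading
# line by line would never let the optional parts match.
my $data = do { local $/; <DATA> };
# Check that the input is nothing but items, then walk through them.
if ($data =~ /^$item_regex+\Z/) {
print "$1, $2, $3\n" while ($data =~ /\G$item_regex/g);
}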
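Option 2 could be sketched along these lines. This is a minimal sketch rather than tested production code: the helper name parse_items and the two sample strings are made up for illustration, and the item pattern is simply repeated from Option 1. The /c flag keeps pos() in place when the loop's last match attempt fails, so the final /\G\Z/ test can tell whether the items covered the entire input.

```perl
use strict;
use warnings;

# Same item pattern as in Option 1.
my $item_regex = qr/
    int\s+(\w+)\s*?\n
    (?:\s*Number\sof\sFlaps:\s(\d+)\s*?\n)?
    (?:\s*IP\sAddress:?\s([\d.]+)\s*?\n)?
/x;

# Hypothetical helper: returns a list of [name, flaps, ip] triples,
# or an empty list if some part of the input is not a well-formed item.
sub parse_items {
    my ($data) = @_;
    my @items;

    # Accumulate captures; /c preserves pos() after the final failed attempt.
    while ($data =~ /\G$item_regex/gc) {
        push @items, [$1, $2, $3];    # $2 and $3 are undef when absent
    }

    # Accept the captures only if the items consumed the whole string.
    return $data =~ /\G\Z/gc ? @items : ();
}

# Invented sample data, not the original poster's.
my $good = "int A\n Number of Flaps: 3\n IP Address: 10.0.0.1\nint B\n";
my $bad  = "int A\n Number of Flaps: 3\nbogus line\nint B\n";

my @good = parse_items($good);
my @bad  = parse_items($bad);
print scalar(@good), " items from good input\n";    # 2 items from good input
print scalar(@bad),  " items from bad input\n";     # 0 items from bad input
```

Each item is attempted only once at each position, so a bad block makes the loop stop where it is instead of re-trying the whole expression from earlier positions.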
Caution: Contents may have been coded under pressure.
Thanks. I think I like the second approach best.
If I may ask for a bit of elaboration, though: of the three approaches you suggested, which is most efficient when there is no match? I notice that when my complicated regexes do not match, I max out the CPU on my server and the HTTP request ends up timing out. When the regex matches, it takes mere fractions of a second.
Is there a way, within these approaches, to minimize the regex's wasted effort in the case where not every 'int' block matches?
Thanks again.