
Yup! That's essentially the solution I arrived at here, although the \G in your version isn't doing anything in this case. I believe (but am open to correction) that \G doesn't have any effect unless you also use the /c modifier, and even then it only has an effect once a failure has occurred, in which case a subsequent match on the same target string will start from the point of the previous failure.

The reason for wanting the capturing group within a repeat count to work in the way I described was that it would allow the s/// used in the above reference to perform the substitution only if the format of the target string exactly matched the regex.

Your regex will happily match 'ff:ff' or 'ff:ff:ff:ff:ff:ff:ff:ff:ff:', as you are aware, which is why you are checking the size of the array afterwards. That's OK, but in the case where you want to modify the target using the s/// operator, it forces you to match and capture, test, and then modify *IF* the number of matches is correct

my $mac = 'ff:ff:ff:ff:ff:ff';
if ( 6 == ( $_ = () = $mac =~ /([0-9A-Z]{1,2})(?::|$)/ig ) ) {
    $mac =~ s/([0-9A-Z]{1,2})(?::|$)/ substr "0$1", -2 /ieg;
}

in order that you ensure that you only modify the target if it actually conforms to the required format.
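To make the problem concrete, here is a minimal sketch that counts how many pieces the loose pattern above finds in each of the strings mentioned. The helper name count_octets is hypothetical and exists only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: count how many octet-like pieces the loose,
# unanchored pattern finds in a string.
sub count_octets {
    my ($str) = @_;
    # The "= () =" idiom forces list context and yields the match count.
    my $count = () = $str =~ /([0-9A-Z]{1,2})(?::|$)/ig;
    return $count;
}

for my $candidate (
    'ff:ff:ff:ff:ff:ff',
    'ff:ff',
    'ff:ff:ff:ff:ff:ff:ff:ff:ff:',
) {
    printf "%-30s %d matches\n", $candidate, count_octets($candidate);
}
```

Only the first string yields a count of 6, so the count check is what rejects the other two; the pattern on its own matches all three.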

That makes for a hell of a lot more work, redundancy, needless capturing and duplication than it would if the repeat group repeated the capture group as well.

It might then look like this:

my $mac = 'ff:ff:ff:ff:ff:ff';
$mac =~ s[^ (?: ( [0-9A-Z]{1,2} ) : ){5} ( [0-9A-Z]{1,2} ) $]
         [ sprintf '%02s' x 6, $1, $2, $3, $4, $5, $6 ]iex;

No need for the redundant capturing, duplicated matching, nor even to test as the substitution will only occur if the target matches the pattern exactly.
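For comparison, this sketch shows what Perl does with that pattern as things stand: the capture group inside the {5} repeat is a single group, so it keeps only the text from its last iteration. The sample address is an assumption chosen so that each octet is distinguishable.

```perl
use strict;
use warnings;

# Sample address with distinct octets (an assumption for illustration).
my $mac = 'aa:bb:cc:dd:ee:ff';

# The group inside {5} is still just group 1; each iteration overwrites
# what the previous one captured.
my @captures =
    $mac =~ /^ (?: ( [0-9A-Z]{1,2} ) : ){5} ( [0-9A-Z]{1,2} ) $/ix;

print scalar(@captures), " captures: @captures\n";
```

The match succeeds, but only two captures come back (the fifth octet and the sixth), which is why $3 to $6 are not available to the replacement.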

I think that John M. Dlugosz hit the nail on the head. The best way to achieve my aim is to use the x operator to build the regex, then compile it with qr//, like this.

my $re_mac = '(?: ( [0-9A-Z]{1,2} ) : )' x 5 . '( [0-9A-Z]{1,2} )';
$re_mac = qr[$re_mac]ix;
....
$mac =~ s[^ $re_mac $]
         [ sprintf '%02s' x 6, $1, $2, $3, $4, $5, $6 ]ex;

That satisfies my desire to avoid redundancy whilst only performing the substitution if the tightly specified regex is matched exactly. If I need to know whether the substitution occurred, I can simply test its return.
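As a rough, self-contained sketch of that approach, the following wraps the substitution in a function and uses the return value of s/// to report whether the string matched. The function name normalise_mac and the sample inputs are assumptions for illustration only.

```perl
use strict;
use warnings;

# Build the pattern text with the x operator: five "octet plus colon"
# groups followed by a final octet, giving six separate capture groups.
my $re_mac = '(?: ( [0-9A-Z]{1,2} ) : )' x 5 . '( [0-9A-Z]{1,2} )';
$re_mac = qr[$re_mac]ix;

# Hypothetical helper: return the zero-padded, colon-free form of the
# address, or undef if the input does not match the pattern exactly.
sub normalise_mac {
    my ($mac) = @_;
    # s/// returns a true value only if a substitution took place.
    my $changed = $mac =~ s[^ $re_mac $]
                           [ sprintf '%02s' x 6, $1, $2, $3, $4, $5, $6 ]ex;
    return $changed ? $mac : undef;
}

for my $input ('f:a:1:22:b:c', 'ff:ff') {
    my $result = normalise_mac($input);
    print "$input => ", ( defined $result ? $result : 'no match' ), "\n";
}
```

Because the pattern is anchored at both ends, a string with too few or too many octets is left untouched and the helper returns undef.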

The main reason for the SoPW was simply that this was the first time I had ever tried to apply a repeat count to a capture group, and when it didn't work the way my instincts told me it would, I tried to look up the description that explained the behaviour and came up short. I'm still not entirely convinced that the passage that Arien cites is an explanation for the behaviour. Given the context of the passage, it seems entirely disparate from the usage I am describing. However, if it was in the author's mind to cover both situations in that short passage, then I think this is a case where a few more words, or perhaps a second short paragraph to separate and clarify the two, would have helped.


Examine what is said, not who speaks.

The 7th Rule of perl club is -- pearl clubs are easily damaged. Use a diamond club instead.

Re3: Capturing brackets within a repeat group
by Hofmator (Curate) on Jan 12, 2003 at 15:44 UTC
    although the \G in your version isn't doing anything in this case. I believe (but am open to correction) that \G doesn't have any effect unless you also use the /c modifier, and even then it only has an effect once a failure has occurred, in which case a subsequent match on the same target string will start from the point of the previous failure.

    then let me correct ;-)

    The '\G' forces the next match to start where the last ended. When the regex is executed the first time, '\G' is thus equivalent to '\A' (beginning of string). The next matches (due to the /g modifier) have to start where the previous one ended, so no part of the string can be skipped. This sure makes a difference, see the examples below.

    sub test_regex {
        local $_ = shift;
        local $\ = "\n";
        print 'string: ', $_;
        print 'with \G: ',    join(':', m/\G ( [0-9A-Z]{1,2} ) (?: :|$ ) /igx);
        print 'without \G: ', join(':', m/ ( [0-9A-Z]{1,2} ) (?: :|$ ) /igx), "\n";
    }

    test_regex('0:0A:0C:B:B8:F');
    test_regex('#0:0A:0C:B:B8:F');
    test_regex('0: 0A:0C:B:B8:F');
    test_regex('0:0Aa:0C:B:B8:F');
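The same comparison can be sketched with match counts instead of printed lists, which makes the effect of \G easier to see at a glance. The helper names octets_anchored and octets_unanchored are hypothetical; the patterns and test strings are the ones used above.

```perl
use strict;
use warnings;

# Hypothetical helper: every match must begin where the previous one ended.
sub octets_anchored {
    my ($str) = @_;
    my @found = $str =~ /\G ( [0-9A-Z]{1,2} ) (?: :|$ ) /igx;
    return @found;
}

# Hypothetical helper: the engine may skip over characters between matches.
sub octets_unanchored {
    my ($str) = @_;
    my @found = $str =~ / ( [0-9A-Z]{1,2} ) (?: :|$ ) /igx;
    return @found;
}

for my $input (
    '0:0A:0C:B:B8:F',
    '#0:0A:0C:B:B8:F',
    '0: 0A:0C:B:B8:F',
    '0:0Aa:0C:B:B8:F',
) {
    my @with    = octets_anchored($input);
    my @without = octets_unanchored($input);
    printf "%-18s with \\G: %d, without \\G: %d\n",
        $input, scalar @with, scalar @without;
}
```

Without \G every string yields six pieces, because the stray '#', the space, and the extra '0' are silently skipped. With \G only the well-formed string reaches six, so the count check actually rejects the malformed ones.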

    -- Hofmator

      ...and thank you for the correction.

      I've only recently begun exploiting the possibilities of \G and /gc, and have found the docs woefully lacking in good examples. Yours is the clearest explanation/demonstration of the \G assertion I've seen.

      It should be in the manual as far as I'm concerned.



        Thanks for the praise :)

        But do you know about the new documentation that comes with perl 5.8.0: perlrequick and perlretut? The part concerning the \G anchor can be found in this section, and I think it's quite a good description.

        -- Hofmator