George_Sherston has asked for the wisdom of the Perl Monks concerning the following question:

In my Camel the following appears on p 155, as an example of repeated global substitution:
1 while s/(\d)(\d\d\d)(?!\d)/$1,$2/;
I'm struggling to understand this. The purpose of the regex is to place commas before every third digit (counting backwards) - i.e. to format them nicely. And indeed, it does do this! But how?

At first I thought it might be a typo. If ! were a :
1 while s/(\d)(\d\d\d)(?:\d)/$1,$2/;
then I'd understand what was going on - we're just saving time by not capturing the third cluster. But then the regex doesn't work. Another way to do it is
1 while s/(\d)(\d\d\d)(?:,|$)/$1,$2/;
Which does work, and in a way I understand. But what I still don't get is this ?! business. I can't find anything in the Camel to explain it. Is it negation? But then why? And why doesn't
1 while s/(\d)(\d\d\d)(?:\D)/$1,$2/;
work?

I'd be very interested to know what's going on...

§ George Sherston

Edit: Changed title, on author's request. larsen

Replies are listed 'Best First'.
Re: ?! idiom in regex - que?
by broquaint (Abbot) on Jun 05, 2002 at 11:15 UTC
    According to man perlre, (?!pattern) is ...
    A zero-width negative look-ahead assertion. For example "/foo(?!bar)/" matches any occurrence of "foo" that isn't followed by "bar". Note however that look-ahead and look-behind are NOT the same thing. You cannot use this for look-behind.
    So the regex captures a digit, followed by three digits which are not followed by another digit, got it? YAPE::Regex::Explain has the following to say about the regex:
    The regular expression:

    (?-imsx:(\d)(\d\d\d)(?!\d))

    matches as follows:

    NODE                     EXPLANATION
    ----------------------------------------------------------------------
    (?-imsx:                 group, but do not capture (case-sensitive)
                             (with ^ and $ matching normally) (with . not
                             matching \n) (matching whitespace and #
                             normally):
    ----------------------------------------------------------------------
      (                        group and capture to \1:
    ----------------------------------------------------------------------
        \d                       digits (0-9)
    ----------------------------------------------------------------------
      )                        end of \1
    ----------------------------------------------------------------------
      (                        group and capture to \2:
    ----------------------------------------------------------------------
        \d                       digits (0-9)
    ----------------------------------------------------------------------
        \d                       digits (0-9)
    ----------------------------------------------------------------------
        \d                       digits (0-9)
    ----------------------------------------------------------------------
      )                        end of \2
    ----------------------------------------------------------------------
      (?!                      look ahead to see if there is not:
    ----------------------------------------------------------------------
        \d                       digits (0-9)
    ----------------------------------------------------------------------
      )                        end of look-ahead
    ----------------------------------------------------------------------
    )                        end of grouping
    ----------------------------------------------------------------------
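To see why the substitution has to be repeated, it may help to watch the string after each pass. The following is only a sketch: commify_trace is a made-up helper name, not something from the Camel or from perlre, and it simply runs the substitution from the question in a loop while recording each intermediate result.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration: applies the substitution from the
# question one pass at a time and records the string after each pass.
sub commify_trace {
    my ($n) = @_;
    my @steps = ($n);
    # Within a run of digits, the only four digits that are NOT followed by
    # a fifth digit are the last four, so each pass puts one comma in front
    # of the final three digits of a run.
    while ($n =~ s/(\d)(\d\d\d)(?!\d)/$1,$2/) {
        push @steps, $n;
    }
    return @steps;
}

print join(" -> ", commify_trace("1234567")), "\n";
```

After the first pass the comma itself counts as "not a digit", which is what lets the next pass match one group further to the left.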

    HTH

    _________
    broquaint

Re: (?! idiom in regex - que?
by marvell (Pilgrim) on Jun 05, 2002 at 11:23 UTC

    You might find it useful to read through the Extended Patterns section of the perlre manual page. But here is a brief outline ...

    /E(?!F)/ matches any E that is not followed by an F (including an E at the very end of the string). In a replacement, the advantage is that you don't have to deal with the bit that follows, because it is never part of the match, i.e.

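    $a = "this is a test\n";
    $a =~ s/t(?!e)/T/g;       # look-ahead: the character after "t" is not consumed
    print $a;

    $b = "this is a test\n";
    $b =~ s/t([^e])/T$1/g;    # no look-ahead: capture the next character and put it back
    print $b;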

    In your example, the code is looking for a digit followed by three digits, followed by "not a digit", but without having to mess about with putting the "not a digit" back in the replacement.
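This is also why the (?:\D) variant from the question misbehaves: \D has to match and consume a real character, whereas (?!\d) only asserts that no digit comes next. A small sketch to compare the two side by side; with_lookahead and with_nondigit are names invented here for illustration.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration: the same loop with two different
# ways of saying "no digit comes next".
sub with_lookahead {
    my ($s) = @_;
    1 while $s =~ s/(\d)(\d\d\d)(?!\d)/$1,$2/;   # asserts only, consumes nothing
    return $s;
}

sub with_nondigit {
    my ($s) = @_;
    1 while $s =~ s/(\d)(\d\d\d)(?:\D)/$1,$2/;   # must match and consume a character
    return $s;
}

for my $in ("1234567", "1234567 dollars") {
    printf "%-16s lookahead: %-18s \\D: %s\n",
        $in, with_lookahead($in), with_nondigit($in);
}
```

On a bare "1234567" the \D version never matches, because nothing at all follows the last digit. When text does follow, it matches but throws away the character it consumed, since the replacement never puts it back.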

    --
    Steve Marvell