in reply to Re: Back reference in s///g ?
in thread Back reference in s///g ?

Hi jwkrahn,

Your solution looks interesting. However, even after googling it I still don't quite understand the zero-width assertions (I have never touched the extended regex syntax). Would you mind explaining it a little more? Thank you.

For all,

Thank you for your replies. The magic capturing parentheses work! However, as far as I know, parentheses are usually used to include a string stored in a variable in a pattern. If parentheses are the only way to do that inclusion, does it mean I will always get the capturing effect? Also, does capturing cost more in regular expression matching?

Re^3: Back reference in s///g ?
by ikegami (Patriarch) on Mar 24, 2008 at 18:54 UTC

    The regex /(?<=[a-zA-Z])\n(?=[a-zA-Z])/ means "A newline, preceded by /[a-zA-Z]/ and followed by /[a-zA-Z]/." The length of the match is one, starting at the newline.

    Contrast with /[a-zA-Z]\n[a-zA-Z]/, which means "A /[a-zA-Z]/, then a newline, then a /[a-zA-Z]/." The length of the match is three, starting at the char before the newline.
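The difference in match length matters under s///g, because characters consumed by one match cannot take part in the next. A minimal sketch, assuming a made-up input of single letters on separate lines (the sub names are hypothetical, chosen only for this illustration):

```perl
use strict;
use warnings;

# Lookaround version: only the newline is part of the match, so the
# letter after it can still serve as the "preceded by" letter next time.
sub join_with_lookaround {
    my ($text) = @_;
    $text =~ s/(?<=[a-zA-Z])\n(?=[a-zA-Z])/ /g;
    return $text;
}

# Consuming version: each match includes both neighbouring letters, so
# they have to be captured and put back in the replacement.
sub join_with_consuming {
    my ($text) = @_;
    $text =~ s/([a-zA-Z])\n([a-zA-Z])/$1 $2/g;
    return $text;
}

# Make newlines visible when printing.
sub show {
    my ($s) = @_;
    $s =~ s/\n/\\n/g;
    return $s;
}

my $sample = "a\nb\nc\n";    # hypothetical sample input
printf "lookaround: %s\n", show(join_with_lookaround($sample));    # a b c\n
printf "consuming:  %s\n", show(join_with_consuming($sample));     # a b\nc\n
```

In the consuming version the "b" is used up by the first match, so the newline between "b" and "c" is never replaced. With longer words the two versions give the same result.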

    Regarding your capture question, there are optimizations in place in some circumstances. I don't know if this is one of them. But honestly, if you need to micro-optimize that much, get familiar with the Benchmark module.
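One way to measure this rather than guess is the core Benchmark module. The following is only a sketch under stated assumptions: the input (1000 copies of a made-up word, one per line) and the sub names are hypothetical, and the numbers will differ between machines and perl versions.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical input: 1000 short words, one per line.
my $text = join "\n", ('alpha') x 1000;

# Substitution that captures the neighbouring letters and puts them back.
sub with_captures {
    my $copy = $text;
    $copy =~ s/([a-zA-Z])\n([a-zA-Z])/$1 $2/g;
    return $copy;
}

# Substitution that uses zero-width lookaround and captures nothing.
sub with_lookaround {
    my $copy = $text;
    $copy =~ s/(?<=[a-zA-Z])\n(?=[a-zA-Z])/ /g;
    return $copy;
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    captures   => \&with_captures,
    lookaround => \&with_lookaround,
});
```

Both subs produce the same string for this input, so the table compares like with like.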

Re^3: Back reference in s///g ?
by jwkrahn (Abbot) on Mar 24, 2008 at 23:01 UTC
    Does capturing cost more in regular expression matching?

    According to perlre:

    WARNING: Once Perl sees that you need one of $&, $`, or $' anywhere in the program, it has to provide them for every pattern match. This may substantially slow your program. Perl uses the same mechanism to produce $1, $2, etc, so you also pay a price for each pattern that contains capturing parentheses. (To avoid this cost while retaining the grouping behaviour, use the extended regular expression "(?: ... )" instead.) But if you never use $&, $` or $', then patterns without capturing parentheses will not be penalized. So avoid $&, $', and $` if you can, but if you can't (and some algorithms really appreciate them), once you've used them once, use them at will, because you've already paid the price. As of 5.005, $& is not so costly as the other two.
    So there will be some minor overhead in using capturing parentheses.
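On the earlier question about including a variable in a pattern: interpolation itself needs no parentheses, and when grouping is wanted (for example around an alternation), (?: ... ) groups without capturing. A minimal sketch, where the variable contents, the test string and the sub name are all made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical alternation stored in a variable.
my $alt = 'foo|bar';

# Report whether a successful match set $1.
sub match_info {
    my ($str, $re) = @_;
    return 'no match' unless $str =~ $re;
    return defined $1 ? "captured '$1'" : 'no capture';
}

# Capturing parentheses: the group is needed for the alternation, and $1 is set.
print match_info('xbary', qr/x($alt)y/), "\n";      # captured 'bar'

# Non-capturing parentheses: same grouping, but $1 stays undefined.
print match_info('xbary', qr/x(?:$alt)y/), "\n";    # no capture

# No parentheses at all: a plain variable interpolates without grouping.
my $word = 'bar';
print match_info('xbary', qr/x${word}y/), "\n";     # no capture
```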