in reply to non-capture mode sometimes erases previous capture

the documentation for /n is a little sketchy, saying only "Non-capture mode. Don't let () fill in $1, $2, etc..."

Then you're looking at perlop or perlreref. The central regexp documentation, perlre, is more specific. While it does include the (IMO misleading) sentence "This modifier ... will stop $1, $2, etc... from being filled in", it goes on to say:

This is equivalent to putting ?: at the beginning of every capturing group

... which I hope clarifies the situation. If you write /([aeiou])(.)/; /([aeiou])(?:.)/;, do you expect $2 to keep its value from the first match? (I hope not, because it doesn't :-) ) Your regexes are the equivalent of /(..)(..)(..)/; /(?:j)/; /(?:g)/;. I hope it's becoming clear that the clearing of the match variables is pretty logical when you look at it this way.
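To make the comparison concrete, here is a minimal sketch. The string "banana" and the array names are just made up for illustration, and /n needs Perl 5.22 or later. It runs the same two-group pattern three ways and copies $1 and $2 right after each match:

```perl
use v5.22;    # /n requires Perl 5.22 or later
use warnings;

my $str = "banana";    # hypothetical test string

# Two capturing groups: both $1 and $2 are set.
$str =~ /([aeiou])(.)/ or die "no match";
my @both = ($1, $2);

# Second group written as (?:...): the match succeeds, so $2 is reset.
$str =~ /([aeiou])(?:.)/ or die "no match";
my @one = ($1, $2);

# Same pattern as the first, but with /n: no group captures at all.
$str =~ /([aeiou])(.)/n or die "no match";
my @none = ($1, $2);

for my $r (\@both, \@one, \@none) {
    say join ", ", map { defined($_) ? "'$_'" : "undef" } @$r;
}
# prints:
# 'a', 'n'
# 'a', undef
# undef, undef
```

The third line of output is the same thing you would get from writing every group as (?:...) by hand.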

The rule Corion named still applies: only rely on the values of $1-$N and the other special regex variables immediately after the successful pattern match that set them. In some programs they may retain their values for a long time if you don't run another regex in the same scope, but I would still strongly recommend against using them for more than a couple of lines after the regex. It's too easy to overlook them when editing the code later, and someone may insert a second regex after the first one.
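A small sketch of how long the values actually last, with made-up strings and variable names: a failed match leaves $1 alone, a match inside a nested block only affects $1 until that block ends, and any later successful match in the same scope replaces it.

```perl
use strict;
use warnings;

"abc" =~ /(b)/ or die "no match";
my $after_match = $1;                  # 'b'

my $matched = "abc" =~ /(q)/;          # fails: captures are left alone
my $after_failure = $1;                # still 'b'

my $inside_block;
{
    "xyz" =~ /(y)/ or die "no match";
    $inside_block = $1;                # 'y' inside this block
}
my $after_block = $1;                  # back to 'b' once the block ends

"abc" =~ /c/ or die "no match";        # succeeds, has no groups
my $after_other = $1;                  # undef

print "$after_match $after_failure $inside_block $after_block ",
      (defined($after_other) ? $after_other : "undef"), "\n";
# prints: b b y b undef
```

The last step is the one that bites in practice: it is exactly what happens when someone later inserts an unrelated regex between the match and the use of $1.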

As BrowserUk already said, if you want to keep match variables, the only reliable way to do so is by making a copy.
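One common way to make that copy, sketched here with a hypothetical date string, is to run the match in list context so the captures land directly in lexical variables:

```perl
use strict;
use warnings;

my $line = "2018-06-01";    # hypothetical input

# In list context a successful match returns its captures,
# so they can be copied in the same statement that runs the regex.
my ($year, $month, $day) = $line =~ /^(\d{4})-(\d\d)-(\d\d)$/
    or die "unexpected format: $line";

# A later successful match resets $1, $2, ... but not the copies.
$line =~ /-/ or die "no dash";
my $live = $1;

print "copied: $year $month $day\n";
print "live \$1: ", (defined($live) ? "'$live'" : "undef"), "\n";
# prints:
# copied: 2018 06 01
# live $1: undef
```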

BTW, if you're doing complex stuff with regexes, you may want to look into named capture groups and the %+ variable (which you also have to make a copy of if you want to keep it).
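A quick sketch of that, again with a made-up date string and made-up group names (named captures need Perl 5.10 or later). Note that %+ is tied to the last successful match just like $1, so it gets copied too:

```perl
use strict;
use warnings;

my $line = "2018-06-01";    # hypothetical input

$line =~ /^(?<year>\d{4})-(?<month>\d\d)-(?<day>\d\d)$/
    or die "unexpected format: $line";

my %date = %+;              # copy the named captures right away

# Another successful match, this time without named groups.
$line =~ /-/ or die "no dash";
my $live_keys = keys %+;    # %+ now reflects the new match

print "copied: $date{year}/$date{month}/$date{day}\n";
print "keys left in %+: $live_keys\n";
# prints:
# copied: 2018/06/01
# keys left in %+: 0
```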

Re^2: non-capture mode sometimes erases previous capture
by raygun (Scribe) on Jun 01, 2018 at 18:00 UTC

    Thank you all for the responses. I'm convinced this is working as designed, not entirely convinced that the design is ideal, and fully convinced that the design won't change at this stage.

    In addition to the sentence in perlre that haukex quotes, that document also says, "Capture group contents are ... available to you outside the pattern until the end of the enclosing block or until the next successful match, whichever comes first." Lesson: read documentation more thoroughly before posting.

    It would be nice for s/// to have an option that means "don't capture and don't alter the existing values of $1, $2, etc." And it wouldn't break much in terms of back-compatibility if /n were this option: with the current behavior being to undefine all these variables when /n is used, extant code can't be relying on them for anything after a /n substitution. (I suppose some code might rely on them being undefined, but that seems a rare edge case.) Still, at the end of the day, if it ain't broke, don't fix it. There are other ways to save $1, so this hypothetical option might be an occasional convenience but certainly isn't a necessity.

    "I would still strongly recommend against using them for more than a couple of lines after the regex"
    Completely agree. I encountered this issue when using the variable in the target of the substitution, in an eval that ran another simple substitution. So it wasn't really any lines after the regex that populated $1.