in reply to Re^3: Capturing string matched by regex
in thread Capturing string matched by regex

I agree with what you say regarding the flexibility of the (?i) construct over the global m{...}i one, but I think I would take issue with your second paragraph.

... the unvarying use of the /xms regex modifier 'tail' (if that's the proper term) is to give the ^ $ . regex operators unvarying behaviors ...

While they are very rarely used that way, m, s and even x are no more invariant than i and can be sprinkled throughout your regular expression. To give a nonsense example:

knoppix@Microknoppix:~$ perl -E '
> $_ = qq{aabb\nwxy935TXB\n123};
> say $1 if m{(?x) ( a [^a] (?s) .* 9 (?-s) .* ) };'
abb
wxy935TXB
knoppix@Microknoppix:~$

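
The one-liner exercises x and s; the same inline switching applies to m. A minimal sketch, assuming a made-up three-line string (the variable names are illustrative only), of how (?m) changes what ^ and $ mean depending on where it sits:

```perl
use strict;
use warnings;

my $text = "first\nsecond\nthird";

# Without (?m), ^ anchors only at the very start of the string.
my $plain = $text =~ m{^second} ? 1 : 0;

# With an inline (?m), ^ also matches just after an embedded newline.
my $inline_m = $text =~ m{(?m)^second} ? 1 : 0;

# Scoped to a group, (?m:...) affects only the anchors inside the group;
# the trailing $ sits outside it and still means "end of string".
my $scoped = $text =~ m{(?m:^second)$} ? 1 : 0;

print "plain=$plain inline_m=$inline_m scoped=$scoped\n";
```

This should print plain=0 inline_m=1 scoped=0: the last pattern fails because the $ outside the group does not match before the embedded newline.
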
I'm also under the impression that (?i) need not be confined solely to the scope of capturing and non-capturing groups. It can also be used as a "switch" to change the matching behaviour from the point at which it appears onwards, or in its own "modifier" group, (?i:pattern), for want of a better word. The following patterns are examples of how I understand the modifiers can be used:

m{(?i)Whole pattern case-insensitive}
m{Case-Sensitive(?i)case-insensitive(?-i)Case-Sensitive Again}
m{Case-Sensitive((?i)except in this capture)Case-Sensitive Again}
m{Case-Sensitive(?i:but not here)Case-Sensitive Again}
m{all case-insensitive(?-i:Except Here) and insensitive again}i
m{(?x) Use white-space\sfor readability(?-x)but literal spaces now}
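
These forms can be checked with a short script. The following is only a sketch: the pattern literals are shortened stand-ins for the ones above (Aa, bb, Cc and so on are made-up placeholders), each paired with one string that should match and one that should not.

```perl
use strict;
use warnings;

# Each entry: [ label, pattern, string expected to match, string expected to fail ]
my @cases = (
    [ 'leading (?i)',      qr{(?i)whole pattern},  'WHOLE PATTERN', 'hole'   ],
    [ '(?i) then (?-i)',   qr{Aa(?i)bb(?-i)Cc},    'AaBBCc',        'AaBBcc' ],
    [ '(?i) in a capture', qr{Aa((?i)bb)Cc},       'AaBBCc',        'AaBBcc' ],
    [ '(?i:...) group',    qr{Aa(?i:bb)Cc},        'AaBbCc',        'aaBbCc' ],
    [ '(?-i:...) under i', qr{aa(?-i:Bb)cc}i,      'AABbCC',        'AAbbCC' ],
    [ '(?x) then (?-x)',   qr{(?x) a b (?-x)c d},  'abc d',         'abcd'   ],
);

my @failures;
for my $case (@cases) {
    my ($label, $re, $hit, $miss) = @$case;
    push @failures, "$label: expected a match on '$hit'" unless $hit  =~ $re;
    push @failures, "$label: unexpected match on '$miss'" if    $miss =~ $re;
}

print @failures
    ? join("\n", @failures) . "\n"
    : "all patterns behaved as described\n";
```

The third case is the one worth noticing: a (?i) placed inside a capture stops applying at the closing parenthesis, so the trailing Cc is case-sensitive again.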

PBP is a fascinating book with very well argued recommendations that make you wonder whether you are doing things the right way. I believe that there are two equally valid reactions to each recommendation in the book: follow the recommendation if, after consideration, it seems better than what you were doing before; alternatively, if you can come up with equally cogent arguments for continuing the way you were, then do that. The main thing is that the book has made you think.

Cheers,

JohnGG

Re^5: Capturing string matched by regex
by AnomalousMonk (Archbishop) on Feb 18, 2012 at 02:41 UTC
    ... I think I would take issue with your second paragraph.
    ... the unvarying use of the /xms regex modifier 'tail' ...
    While they are very rarely used that way, m, s and even x are no more invariant than i ...

    What I meant to convey by my reference to "/xms-tail invariance" was that this is the PBP recommendation (original post amended) and that the reason for it is to nail down the behavior of the ^ $ . critters. For this reason, I regard with horror the idea of sprinkling (?-x) (?m) (?-s) et al through the regex, due to the extreme danger of brain meltdown and subsequent containment breach. For those cases in which one might be tempted to the Dark Side, e.g., the use of (?-s:.) when an "anything-but-a-newline" match is needed (always assuming an /xms tail), PBP discusses alternatives; in the foregoing example, [^\n] (or, in 5.12+, the "experimental" \N sequence).
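
A minimal sketch of that comparison, assuming a made-up two-line string and illustrative variable names; all four patterns keep the /xms tail, and the \N form needs Perl 5.12 or later:

```perl
use 5.012;
use warnings;

my $text = "abc\ndef";

# Under /s a bare dot also matches newline, so .* runs to the end of the string.
my ($dot) = $text =~ m{ \A ( .* ) }xms;

# Three ways to say "anything but a newline" without dropping the /xms tail:
my ($scoped)   = $text =~ m{ \A ( (?-s:.)* ) }xms;   # locally switch /s off
my ($negclass) = $text =~ m{ \A ( [^\n]* )   }xms;   # negated character class
my ($backN)    = $text =~ m{ \A ( \N* )      }xms;   # \N, Perl 5.12+

printf "dot=%d scoped=%d negclass=%d backN=%d\n",
    length $dot, length $scoped, length $negclass, length $backN;
```

The first capture is 7 characters long (it swallows the newline); the other three stop at "abc", so they are interchangeable here and the choice between them is a matter of readability.
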

    I'm also under the impression that (?i) need not be confined solely to the scope of capturing and non-capturing groups ...

    My discussion of the behavior of (?pimsx-imsx) patterns was brief, vague and lacking. I've tried to remedy this with a link to the docs.

    The following patterns are examples of how I understand the modifiers can be used ...

    I haven't tested these, but they look syntactically correct. However, I would quibble with most, especially the latter ones, on stylistic grounds. I haven't time now, but may return to this point with a detailed discussion of my own preferences.

    PBP is a fascinating book with very well argued recommendations ...

    I agree with every statement in this paragraph.