in reply to Regexp - match if not between [ ]

If your \w\d+ pattern can be substituted by \w\d{3}, then this seems to work:

$s = 'The fox did it[ at 12.23 ] well, Cf. 23 A423.23. The swallow was even better,';;

print for split '(?<!Cf)(?<!\w\d{3})\.(?![^\]]+])', $s;;
The fox did it[ at 12.23 ] well, Cf. 23 A423.23
 The swallow was even better,
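For readers who want to see what each piece does, here is the same pattern spelled out under /x as a rough sketch. The variable names ($re, @parts) are mine, not from the post. Note that the final look-ahead only checks whether a "]" occurs somewhere later in the string, so a dot that merely precedes a later bracketed section would be protected as well.

```perl
use strict;
use warnings;

my $s = 'The fox did it[ at 12.23 ] well, Cf. 23 A423.23. The swallow was even better,';

# Same pattern as the one-liner, laid out with comments.
my $re = qr{
    (?<! Cf )          # the dot is not directly preceded by "Cf"
    (?<! \w\d{3} )     # nor by a word character plus exactly three digits
    \.                 # the literal dot to split on
    (?! [^\]]+ \] )    # and is not followed by non-"]" characters and then "]"
}x;

my @parts = split $re, $s;
print "$_\n" for @parts;
```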

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Re^2: Regexp - match if not between [ ]
by Anonymous Monk on May 30, 2011 at 14:26 UTC

    Look-behind is problematic, because the number of digits etc. is not fixed. But thanks. I was really thinking in the wrong direction.

      Look-behind is problematic, because the number of digits etc. is not fixed.

      Look-behinds can still handle the task, but the pattern gets pretty unwieldy if the width varies by more than a few characters:

      print for split m[
          (?<! Cf )
          (?: (?<! \w\d\d\d ) | (?<! \w\d\d ) | (?<! \w\d ) )
          \.
          (?! [^\]]+ \] )
      ]x, $s;;
      The fox did it[ at 12.23 ] well, Cf. 23 A423.23
       The swallow was even better,
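If the width varies by more than that, one alternative is the (*SKIP)(*FAIL) idiom available since Perl 5.10. It is not from the original question, so treat it as a sketch. The idea is to match the things that must not be split first and throw them away, so that only the remaining dots become split points. The `[A-Za-z]\d+` prefix below is an assumption standing in for the real variable-width pattern. With a literal `\w\d+` the final `23.` of the sample would be protected too, because `\w` also matches digits.

```perl
use strict;
use warnings;
use 5.010;    # backtracking control verbs need Perl 5.10 or later

my $s = 'The fox did it[ at 12.23 ] well, Cf. 23 A423.23. The swallow was even better,';

my $re = qr{
    (?:
        \[ [^\]]* \]        # a whole bracketed section
      | Cf\.                # the abbreviation "Cf."
      | [A-Za-z]\d+\.       # ASSUMED prefix: a letter, any number of digits, a dot
    )
    (*SKIP)(*FAIL)          # discard what was matched and resume after it
  |
    \.                      # any dot that survives is a split point
}x;

my @parts = split $re, $s;
print "$_\n" for @parts;
```

Because the protected items are consumed rather than looked behind, their width no longer matters, and the bracket test looks at one complete [ ... ] pair instead of any later "]".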

      But it sounds like you've settled on a solution.

