in reply to How to enforce match priority irrespective of string position

Finally, a "sort of" test case :)

#!/usr/bin/perl

use strict; # https://perlmonks.org/?node_id=11129253
use warnings;

local $_ = <<END;
Point 1.3.4: A piece of text.

Point 1.3.5: A piece of text.

Point 1.3.6: Another piece of text. Point 1.3.6: For some reason this piece of text isn't finished yet.

Point 1.3.6: In fact, this piece of text even broke into a new line.

Point 1.3.7: Finally, a new piece of text.

END

my @parts;
push @parts, $& while / (Point\s[\d.]+:) .*? (?=Point|\z) (?!\1) /gsx;

use Data::Dump 'dd'; dd \@parts;

Outputs four chunks, just like you asked for:

[
  "Point 1.3.4: A piece of text.\n\n",
  "Point 1.3.5: A piece of text.\n\n",
  "Point 1.3.6: Another piece of text. Point 1.3.6: For some reason this piece of text isn't finished yet.\n\nPoint 1.3.6: In fact, this piece of text even broke into a new line.\n\n",
  "Point 1.3.7: Finally, a new piece of text.\n\n",
]
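
If you would rather not use $& (it carried a performance penalty on Perls before 5.20), the same idea can probably be written with an extra outer capture group. This is only a sketch: the sample text is a shortened stand-in for the test case above, and note that the backreference becomes \2 because the outer group takes number 1.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Shortened stand-in for the sample text (an assumption, not the real data).
my $text = join "\n\n",
    "Point 1.3.4: A piece of text.",
    "Point 1.3.6: Another piece of text. Point 1.3.6: Not finished yet.",
    "Point 1.3.6: It even broke into a new line.",
    "Point 1.3.7: Finally, a new piece of text.",
    "";

my @parts;

# Group 1 wraps the whole chunk; group 2 is the "Point x.y.z:" label.
# A chunk ends only where "Point" (or the end of the string) follows
# AND what follows is not the same label again.
while ( $text =~ / ( (Point\s[\d.]+:) .*? (?=Point|\z) (?!\2) ) /gsx ) {
    push @parts, $1;
}

print scalar(@parts), " chunks\n";
print "---\n$_" for @parts;
```

The repeated 1.3.6 entries should end up together in one chunk, the same as with the $& version.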

Re^2: How to enforce match priority irrespective of string position
by Polyglot (Chaplain) on Mar 08, 2021 at 01:33 UTC

    And that method worked! (Though I've had to restructure a bit to accommodate, as that was not in a simple substitution form.) I don't mind doing whatever is necessary to get things working, though...so thank you very much! I'll certainly upvote this when I get my next day's rations.

    This part seems to be the crucial bit: (?=Point|\z) (?!\1). I find this sort of syntax confusing because it always seems to me that the "Point" here should have precedence over anything coming afterward in the regex sequence, in this case the "\1" backreference. If "Point" is already detected from the forward assertion, why can it be matched again (overlapped) by this reference, even if in the negative?

    Well, no complaints at the moment, certainly, as at least the script is now past this hurdle. Thank you.

    Blessings,

    ~Polyglot~

      Because (?= and (?! are ZERO-WIDTH assertions: neither one consumes any text, so both are tested at the same position in the string.
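
      A small sketch of what zero-width means in practice. The sample string and the helper end_of_match are made up for illustration; @+ is Perl's built-in array of match end offsets.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: offset where the match ended, or undef if no match.
sub end_of_match {
    my ($text, $re) = @_;
    return $text =~ $re ? $+[0] : undef;
}

# Made-up sample string for illustration.
my $text = "Point 1.3.6: x Point 1.3.6: y Point 1.3.7: z";

# The lookahead checks that "Point" follows, but the match still ends
# right after "x ": the lookahead itself contributes no width.
my $plain = end_of_match($text, qr/x\s/);
my $ahead = end_of_match($text, qr/x\s(?=Point)/);
print "plain ends at $plain, lookahead ends at $ahead\n";

# Two lookarounds in a row are tested at the SAME position, so the
# second one sees the very same "Point ..." that the first one saw.
my $after_y = $text =~ /y\s(?=Point)(?!Point\s1\.3\.6:)/ ? 1 : 0;  # next label is 1.3.7
my $after_x = $text =~ /x\s(?=Point)(?!Point\s1\.3\.6:)/ ? 1 : 0;  # next label is 1.3.6
print "after y: $after_y, after x: $after_x\n";
```

      The "Point" inside (?=Point) is only a description of what must come next; once the check passes, the regex engine is still standing in front of that text, which is why (?!\1) can look at it again.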

        I appreciate knowing that, but while the assertion may be "zero-width," my mind still stumbles on the point that "Point" is certainly not zero-width. I'd always understood the "zero-width" aspect to be more related to the capturing and positioning of the match within the string. Are all look-arounds zero-width? If so, why must a look-behind be always of a specified length (width) that is not variable?

        Well, it may be that it's just too abstract for me.

        Thank you for your explanation.

        Blessings,

        ~Polyglot~