in reply to Re^5: regex doubt on excluding
in thread regex doubt on excluding

What am I missing?

I think rxrx :), plus pos, @- and @+

So it matches the zero-length string without advancing the position, then matches one newline at that same position (which does advance the position), then matches the zero-length string again, and that's the end of the matches

"a\n\n\nb"
s(2)e(2)pos(2)len(0) ("a\n", "", "\n\nb")
s(2)e(3)pos(3)len(1) ("a\n", "\n", "\nb")
s(3)e(3)pos(3)len(0) ("a\n\n", "", "\nb")

I think that makes sense :)

#!/usr/bin/perl --
use strict;
use warnings;
use Data::Dump qw/ dd /;

for my $many ( 1..4 ){
    my $s = "a\n".("\n" x $many )."\nb";
    dd( $s );
    while( $s =~ m{^(\s*?)$}gm ){
        my $pos = pos( $s );
        my $one = defined $1 ? $1 : '';
        my $len = length $one;
        my $start = $-[0]; # @-
        my $lend = $+[0]; # @+
        print "s($start)e($lend)pos($pos)len($len) ";
        dd(
            ( "".substr $s, 0, $pos - $len),
            ( "".substr $s, $pos-$len, $len ),
            ( "".substr $s, $pos )
        );
    }
}
__END__
"a\n\n\nb"
s(2)e(2)pos(2)len(0) ("a\n", "", "\n\nb")
s(2)e(3)pos(3)len(1) ("a\n", "\n", "\nb")
s(3)e(3)pos(3)len(0) ("a\n\n", "", "\nb")
"a\n\n\n\nb"
s(2)e(2)pos(2)len(0) ("a\n", "", "\n\n\nb")
s(2)e(3)pos(3)len(1) ("a\n", "\n", "\n\nb")
s(3)e(3)pos(3)len(0) ("a\n\n", "", "\n\nb")
s(3)e(4)pos(4)len(1) ("a\n\n", "\n", "\nb")
s(4)e(4)pos(4)len(0) ("a\n\n\n", "", "\nb")
"a\n\n\n\n\nb"
s(2)e(2)pos(2)len(0) ("a\n", "", "\n\n\n\nb")
s(2)e(3)pos(3)len(1) ("a\n", "\n", "\n\n\nb")
s(3)e(3)pos(3)len(0) ("a\n\n", "", "\n\n\nb")
s(3)e(4)pos(4)len(1) ("a\n\n", "\n", "\n\nb")
s(4)e(4)pos(4)len(0) ("a\n\n\n", "", "\n\nb")
s(4)e(5)pos(5)len(1) ("a\n\n\n", "\n", "\nb")
s(5)e(5)pos(5)len(0) ("a\n\n\n\n", "", "\nb")
"a\n\n\n\n\n\nb"
s(2)e(2)pos(2)len(0) ("a\n", "", "\n\n\n\n\nb")
s(2)e(3)pos(3)len(1) ("a\n", "\n", "\n\n\n\nb")
s(3)e(3)pos(3)len(0) ("a\n\n", "", "\n\n\n\nb")
s(3)e(4)pos(4)len(1) ("a\n\n", "\n", "\n\n\nb")
s(4)e(4)pos(4)len(0) ("a\n\n\n", "", "\n\n\nb")
s(4)e(5)pos(5)len(1) ("a\n\n\n", "\n", "\n\nb")
s(5)e(5)pos(5)len(0) ("a\n\n\n\n", "", "\n\nb")
s(5)e(6)pos(6)len(1) ("a\n\n\n\n", "\n", "\nb")
s(6)e(6)pos(6)len(0) ("a\n\n\n\n\n", "", "\nb")

Re^7: regex doubt on excluding
by Athanasius (Archbishop) on Aug 16, 2014 at 06:25 UTC

    I’ve finally found some documentation which sheds light on this (and it’s only taken me 4 months!). From perlre#Repeated-Patterns-Matching-a-Zero-length-Substring:

    The higher-level loops preserve an additional state between iterations: whether the last match was zero-length. To break the loop, the following match after a zero-length match is prohibited to have a length of zero.

    I was wrong in thinking that the search position advances after a successful match. It does advance to the position immediately following the last match, but when that match was of zero length the “advance” is zero. But Perl’s regex engine prevents an infinite loop of zero-length matches by applying the rule quoted above.
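
    For anyone who wants to see the rule in isolation, here is a minimal sketch. The helper name match_spans is my own invention for illustration, not anything from perlre. The first part is the 'bar' example that perlre itself gives in that section; the second part reruns the pattern from this thread on the shortest test string and should report the same three start/end pairs as the s(..)e(..) values shown earlier.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: return the [start, end] offsets of each
    # successive match of $re against $str in a scalar-context //g loop.
    sub match_spans {
        my ( $str, $re ) = @_;
        my @spans;
        while ( $str =~ /$re/g ) {
            push @spans, [ $-[0], $+[0] ];    # @- and @+ hold the match offsets
        }
        return @spans;
    }

    # Example taken from perlre: the lazy \w?? prefers the empty match,
    # but an empty match may not directly follow another empty match,
    # so empty and one-character matches alternate.
    ( my $t = 'bar' ) =~ s/\w??/<$&>/g;
    print "$t\n";    # perlre documents the result as <><b><><a><><r><>

    # The pattern from this thread, on the shortest test string.
    my @spans = match_spans( "a\n\n\nb", qr/^(\s*?)$/m );
    print join( ' ', map { "[$_->[0],$_->[1]]" } @spans ), "\n";
    ```

    The second line of output pairs up as empty match, one-newline match, empty match, which is the alternation the quoted rule predicts.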

    Thanks to Anonymous Monk for the useful analysis.

    Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,