Jaap has asked for the wisdom of the Perl Monks concerning the following question:

Wise Monks,

The code below:
my $testVar = <<EOF;
fasdfasdf
asdfasdfasdf

asdfasdfasdf
asdfasdfasdf

asdfasdfasdf
asdfasdfasdf

sdfasdfasdf
EOF

while ($testVar =~ m/(.*?)\n\n/sg) {
    print 0;
}
prints just one 0. When I remove the /s modifier, it prints three 0's as expected.
The /s modifier is supposed to make the . match newlines too, right? Then why does it match only once?

Re: strange behaviour of /regexp/sg
by hv (Prior) on Mar 21, 2003 at 17:25 UTC

    This is a known bug in perl-5.6.1. The regexp optimiser sees the initial ".*" and treats the pattern as if it were anchored to the start of the string. That is fine for a single match, but it fails to take the /g into account: when the second match attempt finds that it is not at the start of the string, it immediately aborts the match.

    Small variations are enough to avoid the optimiser's error. Anchoring the pattern with \G is one way:

    $testVar =~ /\G(.*?)\n\n/sg
    and inserting an empty non-capture is another:
    $testVar =~ /(?:)(.*?)\n\n/sg
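
    To compare the original pattern with the two workarounds on a given perl, a minimal sketch along these lines may help. The sample string and the %count hash are illustrative assumptions, not part of the original post. On perl 5.8.0 or later all three counts should be 3; per the description above, 5.6.1 would be expected to report only 1 for the plain pattern.

```perl
use strict;
use warnings;

# Assumed sample data: four blocks separated by blank lines.
my $text = "a1\na2\n\nb1\nb2\n\nc1\nc2\n\nd1\n";

# Hypothetical tally of how many times each variant matches under /g.
my %count = (plain => 0, anchored => 0, empty_group => 0);

# Original pattern: lazy ".*?" with /s so that "." also matches newlines.
$count{plain}++ while $text =~ /(.*?)\n\n/sg;

# Workaround 1: \G anchors each attempt where the previous match ended.
$count{anchored}++ while $text =~ /\G(.*?)\n\n/sg;

# Workaround 2: an empty non-capturing group in front of the ".*?".
$count{empty_group}++ while $text =~ /(?:)(.*?)\n\n/sg;

printf "%-12s %d\n", $_, $count{$_} for sort keys %count;
```

    Each loop starts from the beginning of the string, because a failed /g match in scalar context resets pos() unless /c is given.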

    Hugo
      vote++
      Thank you very much.
Re: strange behaviour of /regexp/sg
by dga (Hermit) on Mar 21, 2003 at 16:44 UTC

    When I ran this it printed 3 zeros each time. What matched differs with and without /s, but the number of matches remained 3.

    Update: My version was 5.6.0

Re: strange behaviour of /regexp/sg
by diotalevi (Canon) on Mar 21, 2003 at 16:44 UTC

    I'd have written that with \G, but it works (meaning '000' is printed) just fine on AS perl 5.6.1 build 633. Which platform are you using?

      I tried both SiePerl 5.6.1 on Windows NT 4 and Perl 5.6.1 on Solaris.
Re: strange behaviour of /regexp/sg
by Jaap (Curate) on Mar 21, 2003 at 16:37 UTC
    Hey, this only happens in perl 5.6.1.
    Perl 5.8.0 does print three 0's with the /s modifier. Is this a known bug?
Re: strange behaviour of /regexp/sg
by Jaap (Curate) on Mar 21, 2003 at 17:13 UTC
    Now I have no way to match the blocks. I tried replacing '.' with '[.\n]' but that doesn't work. I also added \G to the beginning but that didn't do much.

    How can I match the blocks?
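
    On the '[.\n]' attempt: inside a bracketed character class the dot loses its special meaning and stands for a literal '.', so that class matches only a dot or a newline. A class such as [\s\S] matches any character, newline included. The sketch below illustrates the difference; the sample string and variable names are assumptions for illustration, and whether [\s\S] also sidesteps the 5.6.1 optimiser problem is not confirmed anywhere in this thread.

```perl
use strict;
use warnings;

# Assumed sample data: one two-line block, a blank line, then more text.
my $block = "first line\nsecond line\n\nrest";

# [.\n] is "a literal dot or a newline", so it cannot consume the letters
# that stand between the start of the string and the blank line.
my $literal_class = ($block =~ /^([.\n]*?)\n\n/) ? 1 : 0;

# [\s\S] is "whitespace or non-whitespace", i.e. any character at all,
# so it spans the embedded newline without needing /s.
my ($any_class) = $block =~ /^([\s\S]*?)\n\n/;

print "[.\\n] matched: $literal_class\n";
print "[\\s\\S] captured: $any_class\n";
```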