in reply to Is your Perl broken? Bug help needed.

Here's the latest...

I downgraded one of my servers to run 5.6.0, and the problem goes away. Perhaps this is drastic, but to avoid problems I'm going to downgrade all of my servers to 5.6.0 until the problem is officially addressed in a later release.

Re: Re: Is your Perl broken? Bug help needed.
by japhy (Canon) on Jan 10, 2002 at 03:29 UTC
    5.6.0 has far more bugs. And you can get around the bug by using \C instead of /./s (so long as you aren't working with Unicode). And if the bug has been fixed in bleadperl, it'll be fixed for 5.6.2.

    Update: Hmm. Well then, try (?:.|\n) or [\000-\377]. Maybe?

    _____________________________________________________
    Jeff[japhy]Pinyan: Perl, regex, and perl hacker.
    s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;

      While I wouldn't advocate moving back to 5.6.0, I don't think \C avoids this particular bug...
      % perl -le 'print "($1) " while "this_is_broken_" =~ /(\C*?)_/sg'
      (this)
      % perl -le 'print "($1) " while "this_is_broken_" =~ /(\C*?)_/g'
      (this)
      % perl -le 'print "($1) " while "this_is_broken_" =~ /(.*?)_/sg'
      (this)
      % perl -mre=debug -le 'print "($1) " while "this_is_broken_" =~ /(\C*?)_/sg'
      Freeing REx: `,'
      Compiling REx `(\C*?)_'
      size 10 first at 5
         1: OPEN1(3)
         3: MINMOD(4)
         4: STAR(6)
         5: SANY(0)
         6: CLOSE1(8)
         8: EXACT <_>(10)
        10: END(0)
      floating `_' at 0..2147483647 (checking floating) anchored(SBOL) implicit minlen 1
      Guessing start of match, REx `(\C*?)_' against `this_is_broken_'...
      Found floating substr `_' at offset 4...
      Guessed: match at offset 0
      Matching REx `(\C*?)_' against `this_is_broken_'
        Setting an EVAL scope, savestack=9
         0 <> <this_is_brok>      |  1: OPEN1
         0 <> <this_is_brok>      |  3: MINMOD
         0 <> <this_is_brok>      |  4: STAR
        Setting an EVAL scope, savestack=9
         0 <> <this_is_brok>      |  6: CLOSE1
         0 <> <this_is_brok>      |  8: EXACT <_>
                                       failed...
                                     SANY can match 1 times out of 1...
         1 <t> <his_is_brok>      |  6: CLOSE1
         1 <t> <his_is_brok>      |  8: EXACT <_>
                                       failed...
                                     SANY can match 1 times out of 1...
         2 <th> <is_is_brok>      |  6: CLOSE1
         2 <th> <is_is_brok>      |  8: EXACT <_>
                                       failed...
                                     SANY can match 1 times out of 1...
         3 <thi> <s_is_brok>      |  6: CLOSE1
         3 <thi> <s_is_brok>      |  8: EXACT <_>
                                       failed...
                                     SANY can match 1 times out of 1...
         4 <this> <_is_brok>      |  6: CLOSE1
         4 <this> <_is_brok>      |  8: EXACT <_>
         5 <this_> <is_brok>      | 10: END
      Match successful!
      (this)
      Guessing start of match, REx `(\C*?)_' against `is_broken_'...
      Not at start...
      Match rejected by optimizer
      Freeing REx: `(\C*?)_'
      Let me know if you'd like more debugging info (it's from the Cobalt setup I mentioned earlier).

      UPDATE:

      However, using /.{0,}?/s instead of /.*?/s seems to fix it for me...

      % perl -le 'print "($1) " while "this_is_broken_" =~ /(.*?)_/sg'
      (this)
      % perl -le 'print "($1) " while "this_is_broken_" =~ /(.{0,}?)_/sg'
      (this) (is) (broken)

      UPDATE 2

      Both of japhy's updated suggestions seem to work for me... with and w/o the /s modifier...

      (Broken Base Case...)
      % perl -le 'print "($1) " while "this_is_broken_" =~ /(.*?)_/sg'
      (this)
      % perl -le 'print "($1) " while "this_is_broken_" =~ /([\000-\377]*?)_/sg'
      (this) (is) (broken)
      % perl -le 'print "($1) " while "this_is_broken_" =~ /([\000-\377]*?)_/g'
      (this) (is) (broken)
      % perl -le 'print "($1) " while "this_is_broken_" =~ /((?:.|\n)*?)_/g'
      (this) (is) (broken)
      % perl -le 'print "($1) " while "this_is_broken_" =~ /((?:.|\n)*?)_/sg'
      (this) (is) (broken)
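For anyone who wants to check their own perl in one go, here is a minimal sketch that bundles the one-liners above into a single script. The test string, the patterns, and the expected captures come straight from this thread; the `@candidates` table, the labels, and the "AFFECTED" wording are just my own scaffolding. The patterns are kept as literals inside closures on the assumption that this compiles them the same way as the one-liners do. I left `\C` out because it is not available in newer perls.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Input and expected captures are taken from the one-liners in this thread.
my $input    = "this_is_broken_";
my @expected = qw(this is broken);

# Each entry pairs a display label with a closure that holds the pattern as a
# literal and returns every capture found under /g.
my @candidates = (
    [ '/(.*?)_/sg' => sub {
        my $s = shift; my @got;
        push @got, $1 while $s =~ /(.*?)_/sg;
        return @got;
    } ],
    [ '/(.{0,}?)_/sg' => sub {
        my $s = shift; my @got;
        push @got, $1 while $s =~ /(.{0,}?)_/sg;
        return @got;
    } ],
    [ '/([\000-\377]*?)_/sg' => sub {
        my $s = shift; my @got;
        push @got, $1 while $s =~ /([\000-\377]*?)_/sg;
        return @got;
    } ],
    [ '/((?:.|\n)*?)_/g' => sub {
        my $s = shift; my @got;
        push @got, $1 while $s =~ /((?:.|\n)*?)_/g;
        return @got;
    } ],
);

# Report, per pattern, whether all three captures came back.
for my $candidate (@candidates) {
    my ($label, $run) = @$candidate;
    my @got    = $run->($input);
    my $status = "@got" eq "@expected" ? "ok" : "AFFECTED";
    printf "%-22s %-9s (%s)\n", $label, $status, join(") (", @got);
}
```

Going by the results above, an affected perl should flag only the first line, since the other three patterns were the ones that worked for me.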

      -Blake