in reply to Re: Is your Perl broken? Bug help needed.
in thread Is your Perl broken? Bug help needed.

5.6.0 has far more bugs. And you can get around the bug by using \C instead of /./s (so long as you aren't working with Unicode). And if the bug has been fixed in bleadperl, it'll be fixed for 5.6.2.

Update: Hmm. Well then, try (?:.|\n) or [\000-\377]. Maybe?

_____________________________________________________
Jeff[japhy]Pinyan: Perl, regex, and perl hacker.
s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;

Re: Re: Re: Is your Perl broken? Bug help needed.
by blakem (Monsignor) on Jan 10, 2002 at 03:36 UTC
    While I wouldn't advocate moving back to 5.6.0, I don't think \C avoids this particular bug...
    % perl -le 'print "($1) " while "this_is_broken_" =~ /(\C*?)_/sg'
    (this)
    % perl -le 'print "($1) " while "this_is_broken_" =~ /(\C*?)_/g'
    (this)
    % perl -le 'print "($1) " while "this_is_broken_" =~ /(.*?)_/sg'
    (this)
    % perl -mre=debug -le 'print "($1) " while "this_is_broken_" =~ /(\C*?)_/sg'
    Freeing REx: `,'
    Compiling REx `(\C*?)_'
    size 10 first at 5
       1: OPEN1(3)
       3:   MINMOD(4)
       4:   STAR(6)
       5:     SANY(0)
       6: CLOSE1(8)
       8: EXACT <_>(10)
      10: END(0)
    floating `_' at 0..2147483647 (checking floating) anchored(SBOL) implicit minlen 1
    Guessing start of match, REx `(\C*?)_' against `this_is_broken_'...
    Found floating substr `_' at offset 4...
    Guessed: match at offset 0
    Matching REx `(\C*?)_' against `this_is_broken_'
      Setting an EVAL scope, savestack=9
       0 <> <this_is_brok>       |  1:  OPEN1
       0 <> <this_is_brok>       |  3:  MINMOD
       0 <> <this_is_brok>       |  4:  STAR
      Setting an EVAL scope, savestack=9
       0 <> <this_is_brok>       |  6:    CLOSE1
       0 <> <this_is_brok>       |  8:    EXACT <_>
                                      failed...
                                  SANY can match 1 times out of 1...
       1 <t> <his_is_brok>       |  6:    CLOSE1
       1 <t> <his_is_brok>       |  8:    EXACT <_>
                                      failed...
                                  SANY can match 1 times out of 1...
       2 <th> <is_is_brok>       |  6:    CLOSE1
       2 <th> <is_is_brok>       |  8:    EXACT <_>
                                      failed...
                                  SANY can match 1 times out of 1...
       3 <thi> <s_is_brok>       |  6:    CLOSE1
       3 <thi> <s_is_brok>       |  8:    EXACT <_>
                                      failed...
                                  SANY can match 1 times out of 1...
       4 <this> <_is_brok>       |  6:    CLOSE1
       4 <this> <_is_brok>       |  8:    EXACT <_>
       5 <this_> <is_brok>       | 10:    END
    Match successful!
    (this)
    Guessing start of match, REx `(\C*?)_' against `is_broken_'...
    Not at start...
    Match rejected by optimizer
    Freeing REx: `(\C*?)_'
    Let me know if you'd like more debugging info (it's from the Cobalt setup I mentioned earlier).

    UPDATE:

    However, using /.{0,}?/s instead of /.*?/s seems to fix it for me...

    % perl -le 'print "($1) " while "this_is_broken_" =~ /(.*?)_/sg'
    (this)
    % perl -le 'print "($1) " while "this_is_broken_" =~ /(.{0,}?)_/sg'
    (this)
    (is)
    (broken)

    UPDATE 2:

    Both of japhy's updated suggestions seem to work for me... with and w/o the /s modifier...

    (Broken Base Case...)
    % perl -le 'print "($1) " while "this_is_broken_" =~ /(.*?)_/sg'
    (this)
    % perl -le 'print "($1) " while "this_is_broken_" =~ /([\000-\377]*?)_/sg'
    (this)
    (is)
    (broken)
    % perl -le 'print "($1) " while "this_is_broken_" =~ /([\000-\377]*?)_/g'
    (this)
    (is)
    (broken)
    % perl -le 'print "($1) " while "this_is_broken_" =~ /((?:.|\n)*?)_/g'
    (this)
    (is)
    (broken)
    % perl -le 'print "($1) " while "this_is_broken_" =~ /((?:.|\n)*?)_/sg'
    (this)
    (is)
    (broken)

    -Blake