While I wouldn't advocate moving back to 5.6.0, I don't think
\C avoids this particular bug...
% perl -le 'print "($1) " while "this_is_broken_" =~ /(\C*?)_/sg'
(this)
% perl -le 'print "($1) " while "this_is_broken_" =~ /(\C*?)_/g'
(this)
% perl -le 'print "($1) " while "this_is_broken_" =~ /(.*?)_/sg'
(this)
% perl -mre=debug -le 'print "($1) " while "this_is_broken_" =~ /(\C*?)_/sg'
Freeing REx: `,'
Compiling REx `(\C*?)_'
size 10 first at 5
1: OPEN1(3)
3: MINMOD(4)
4: STAR(6)
5: SANY(0)
6: CLOSE1(8)
8: EXACT <_>(10)
10: END(0)
floating `_' at 0..2147483647 (checking floating) anchored(SBOL) implicit minlen 1
Guessing start of match, REx `(\C*?)_' against `this_is_broken_'...
Found floating substr `_' at offset 4...
Guessed: match at offset 0
Matching REx `(\C*?)_' against `this_is_broken_'
Setting an EVAL scope, savestack=9
0 <> <this_is_brok> | 1: OPEN1
0 <> <this_is_brok> | 3: MINMOD
0 <> <this_is_brok> | 4: STAR
Setting an EVAL scope, savestack=9
0 <> <this_is_brok> | 6: CLOSE1
0 <> <this_is_brok> | 8: EXACT <_>
failed...
SANY can match 1 times out of 1...
1 <t> <his_is_brok> | 6: CLOSE1
1 <t> <his_is_brok> | 8: EXACT <_>
failed...
SANY can match 1 times out of 1...
2 <th> <is_is_brok> | 6: CLOSE1
2 <th> <is_is_brok> | 8: EXACT <_>
failed...
SANY can match 1 times out of 1...
3 <thi> <s_is_brok> | 6: CLOSE1
3 <thi> <s_is_brok> | 8: EXACT <_>
failed...
SANY can match 1 times out of 1...
4 <this> <_is_brok> | 6: CLOSE1
4 <this> <_is_brok> | 8: EXACT <_>
5 <this_> <is_brok> | 10: END
Match successful!
(this)
Guessing start of match, REx `(\C*?)_' against `is_broken_'...
Not at start...
Match rejected by optimizer
Freeing REx: `(\C*?)_'
Let me know if you'd like more debugging info (it's from the Cobalt setup I mentioned earlier).
UPDATE:
However, using /.{0,}?/s instead of /.*?/s seems to fix it for me...
% perl -le 'print "($1) " while "this_is_broken_" =~ /(.*?)_/sg'
(this)
% perl -le 'print "($1) " while "this_is_broken_" =~ /(.{0,}?)_/sg'
(this)
(is)
(broken)
UPDATE 2:
Both of japhy's updated suggestions seem to work for me, with and without the /s modifier...
(Broken Base Case...)
% perl -le 'print "($1) " while "this_is_broken_" =~ /(.*?)_/sg'
(this)
% perl -le 'print "($1) " while "this_is_broken_" =~ /([\000-\377]*?)_/sg'
(this)
(is)
(broken)
% perl -le 'print "($1) " while "this_is_broken_" =~ /([\000-\377]*?)_/g'
(this)
(is)
(broken)
% perl -le 'print "($1) " while "this_is_broken_" =~ /((?:.|\n)*?)_/g'
(this)
(is)
(broken)
% perl -le 'print "($1) " while "this_is_broken_" =~ /((?:.|\n)*?)_/sg'
(this)
(is)
(broken)
-Blake