I traced the problem to the pp_unstack code at the end of the loop, which is what breaks the closure-ness of the regex block. Here's a minimal test case using the regex:
    for my $str ( 1..3 ) {
        my $lex = "";
        $str =~ m# (\d) (?{ $lex = $1; print " In: $lex\n" }) #x;
        print "Out: $lex\n";
    }
But it turns out you don't even need a regex to trigger the bug. Here's the same thing with a regular closure:
    for $i ( 1..3 ) {
        my $lex = "";
        $code = sub { $lex = "Test $i"; print " In: $lex\n"; } unless $code;
        $code->();
        print "Out: $lex\n";
    }
The basic underlying problem is that when a lexical variable goes out of scope, the unstack code tries to "null out" any lexicals in its scope. But it can't do that if someone's trying to return the lexical as a result. In that case, it cuts it loose and cooks up a new lexical. Unfortunately, nobody is returning the lexical here, but the code thinks someone is, so it cuts the variable loose from the inner scope rather than from any outer scope.
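To see the "cut it loose and cook up a new lexical" behavior from plain Perl, here is a minimal sketch. The sub name collect_refs is hypothetical, and the sketch relies only on the rule that a my variable still referenced at scope exit is left alone and replaced by a fresh one on the next iteration:

```perl
use strict;
use warnings;

# Hypothetical helper: keep a reference to the loop's lexical on every
# iteration, so the variable cannot simply be cleared at scope exit.
sub collect_refs {
    my @refs;
    for my $i ( 1 .. 3 ) {
        my $lex = "value $i";
        push @refs, \$lex;    # $lex is still referenced when the block ends
    }
    return @refs;
}

# Each reference points at its own variable, with its own value intact.
print "$$_\n" for collect_refs();
```

Because each reference keeps its variable alive, the three references should be distinct and should each hold the value from their own iteration.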

Ordinarily closures don't run into this problem because they reclone their symbol table each time through the loop. But as you can see in the second example, if we suppress the recloning with an unless, we get the same bug.
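For contrast with the second example above, here is a sketch of the ordinary case, where the sub is created afresh on every iteration. The sub name run_fresh_closures is hypothetical, and the results are collected in an array rather than printed so they are easy to inspect:

```perl
use strict;
use warnings;

# Hypothetical helper: build a new closure on every pass, so each one
# binds to that iteration's $lex and $i.
sub run_fresh_closures {
    my @out;
    for my $i ( 1 .. 3 ) {
        my $lex  = "";
        my $code = sub { $lex = "Test $i" };    # recloned every iteration
        $code->();
        push @out, $lex;
    }
    return @out;
}

print "Out: $_\n" for run_fresh_closures();
```

Here the assignment inside the closure is visible outside it on every iteration, since the closure and the loop body share the same $lex each time.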

Actually, it's arguably not a bug in the second case, because we took the closure the first time through the loop, and if its bindings refer to the my variables from that first iteration, they can't also refer to the variables in the other iterations, presuming those to be "different" my variables.
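That reading can be checked with a strict-clean variant of the second example. This is a sketch only: the sub name run_cached_closure is hypothetical, and it assumes the closure stays bound to the first iteration's $lex and $i, as argued above:

```perl
use strict;
use warnings;

# Hypothetical helper: create the closure once, on the first pass,
# then keep calling that same closure on later passes.
sub run_cached_closure {
    my @out;
    my $code;
    for my $i ( 1 .. 3 ) {
        my $lex = "";
        $code = sub { $lex = "Test $i" } unless $code;    # cloned only once
        $code->();
        push @out, $lex;    # this iteration's $lex, not necessarily the closure's
    }
    return @out;
}

print "Out: '$_'\n" for run_cached_closure();
```

Under that assumption only the first iteration sees the closure's assignment. On later iterations the closure still writes to the first $lex, so the loop's own $lex stays empty.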

So probably what needs to happen is that when the sv_compile_2op() routine is compiling the insides of (?{...}), it needs to allocate and save a real CV rather than just the opcodes, so that when the regex engine gets around to running the closure, it can actually be made a closure by calling cv_clone().

Oh, and my mail server is down right now, so if someone could forward this to perlbug, I'd be much obliged.


In reply to Re: regex code embedding problem? by TimToady
in thread regex code embedding problem? by perlguy
