perl-diddler has asked for the wisdom of the Perl Monks concerning the following question:

I've run into a perl 'feature' that doesn't seem to work the way I think it 'should'. I'd tend to favor calling it a bug, though I suppose it could be a design decision I don't understand, in which case maybe someone could help me understand why it was designed this way... Thank you for aid in clarity... Linda

I'm dealing with instances of the error:

  Variable "$Need_msg" will not stay shared at ./find_deps_from_errs.pl line 1211.
  Variable "$newpkg" will not stay shared at ./find_deps_from_errs.pl line 1216.

I turned on diagnostics, and believe I understand what it thinks I am trying to do, but what it thinks I am trying to do isn't what I am doing (maybe the diagnostics are misleading me in all this...but anyway...)

I have a case where I'm using a subroutine within an enclosing block within another subroutine, and it accesses a variable declared outside the inner subroutine's enclosing block but inside the outer subroutine.

My intention is that it be 'global' inside the outer subroutine, yet perl (according to the diagnostics message) seems to be trying to make the variable part of a closure -- even though it's not a variable that goes out of scope at the end of the inner subroutine -- so it shouldn't, as far as I can tell, be trying to make a closure out of it.

Here's the raw code of the *section* in question. I'm actually getting the error on two variables at this point.

Variables "$Need_msg" and "$newpkg" below.

Note -- I don't need to know how to 'fix' or get around the problem. This isn't that type of problem -- it's a case of why it doesn't work the way that I think it should work (I'm coming from working code and playing with different ways of "restructuring" it).

The error message comes at places where the affected variables are used within the body of "sub do_pkg_need_from_Dist".

  sub process_need_from_Dist ($$\@) {
      my ($Rpmcmd, $Distdb, $argsp)=@_;
      my ($need, $insted, $rpm_nvra) = @$argsp;
      unless (length $insted) { $insted=0 }
      my $rqng_pkg;
      return undef unless $rqng_pkg=Pkg->by_rpm_nvra($rpm_nvra);
      my $Need_msg = sub ($) {
          Inf($_[0] ." needed by " . ($insted ? "(installed) " : "") . $rqng_nvra . "\n");
      };
      my $newpkg;
      {
          my ($rq_nam, $rq_vr, $rq_op);
          sub do_pkg_need_from_Dist($$$$) {
              my ($Distdb, $rq_nam, $rq_vr, $rq_op)=@_;
              my ($rq_v, $rq_r);
              if ($rq_vr) {   # see if rq_vr contains rq_v & _r
                  my @args=split_vr($rq_vr);
                  if (@args == 2) { ($rq_v,$rq_r)=@args }
                  else { $rq_v=$rq_vr }
                  $Need_msg->(sprintf "Pkg %s %s v=%s, r=%s",
                              $rq_nam, $rq_op, $rq_v, sp($rq_r) );
              }
              # need exact pkg case
              if ($rq_v && $rq_op eq '=') {   # exact pkg-nv[r] search
                  $newpkg=$Distdb-> find_n_v_r_to_pkg($rq_nam,$rq_v,$rq_r)
                      or do {Serious "find_n_v_r_to_pkg failed\n";return undef};
              } else {    # version free search...validate >= after
                  return $Need_msg->("Pkg $rq_nam, any_ver") if !defined $rq_op;
                  $newpkg = $Distdb->find_rpmnam_to_pkg($rq_nam) or do {
                      Serious "find_rpmnam_to_pkg failed\n"; return undef
                  };
                  if ($rq_vr && cmp_vr_strs($newpkg->vr, $rq_vr) < 0) {
                      Serious "Pkg nvr '",$newpkg->nvr,"' not >= '$rq_vr'\n";
                      return undef;
                  }
                  return $newpkg;
              }
          }
          {
              my $need_re1 = qr{(\S+)\s(>?=)\s(\S+)};
              my $need_re2 = qr{^[^\.]+$} ;
              ($rq_nam, $rq_op, $rq_vr) = $need=~/$need_re1/ or
                  ($need=~/$need_re2/ and $rq_nam=$need);
          }
          if ($rq_nam) {
              $newpkg=do_pkg_need_from_Dist($Distdb, $rq_nam, $rq_vr, $rq_op);
          } elsif ($newpkg=$Distdb->find_rpm_owning_file_to_pkg($need)) {
              $newpkg->add_file($need);
          } else {
              Serious "find_rpm_owning_file_to_pkg failed\n";
              return undef
          }
      }
  }

Replies are listed 'Best First'.
Re: Perl scoping not logical...?
by pc88mxer (Vicar) on Apr 27, 2008 at 16:17 UTC
    I think adding use diagnostics; will help you understand this issue, and you can read more about the issue at http://www.perl.com/pub/a/2002/05/07/mod_perl.html:
    (W) An inner (nested) named subroutine is referencing a lexical variable defined in an outer subroutine.

    When the inner subroutine is called, it will probably see the value of the outer subroutine's variable as it was before and during the *first* call to the outer subroutine; in this case, after the first call to the outer subroutine is complete, the inner and outer subroutines will no longer share a common value for the variable. In other words, the variable will no longer be shared.

    Furthermore, if the outer subroutine is anonymous and references a lexical variable outside itself, then the outer and inner subroutines will never share the given variable.

    This problem can usually be solved by making the inner subroutine anonymous, using the sub {} syntax. When inner anonymous subs that reference variables in outer subroutines are called or referenced, they are automatically rebound to the current values of such variables.
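
    The behaviour the diagnostic describes can be reproduced in a few lines. This is only a minimal sketch: outer, inner_named and $inner_anon are hypothetical names, not taken from the posted code, and the warning is silenced only so the demo runs quietly.

```perl
use strict;
use warnings;
no warnings 'closure';   # silence "will not stay shared" for this demo only

sub outer {
    my ($label) = @_;

    # Named sub: created once at compile time and bound to the $label
    # that outer() uses during its first call.
    sub inner_named { return $label }

    # Anonymous sub: a fresh closure over the current $label on every call.
    my $inner_anon = sub { return $label };

    return inner_named() . "/" . $inner_anon->();
}

my @seen = map { outer($_) } qw(first second);
print "$_\n" for @seen;   # first/first, then first/second
```

    On the second call the named sub still reports the first call's value, while the anonymous sub follows the current one.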

    In your case I think the problem is the definition of sub do_pkg_need_from_Dist. Does it really need to be defined as a closure within process_need_from_Dist?

    Update: Your use of the variable $new_pkg is very odd:

      my $new_pkg;
      {
          sub do_pkg_need_from_Dist {
              ...
              $new_pkg = ...;    # sets $new_pkg from the outer scope
              ...
              return $new_pkg;
          }
          $new_pkg = do_pkg_need_from_Dist(...);    # sets $new_pkg again???
      }
    Declaring my $new_pkg; inside do_pkg_need_from_Dist should fix the error message about that variable.
      I wanted the closure to isolate the variables declared at the top of the closure, but it isn't really needed for the program nor the error -- i.e. removing the brace after "my $newpkg;" and before "my ($rq_nam, $rq_vr, $rq_op)" makes no difference in regards to the problem -- I get the same warning that perl isn't able to share the syntactically shared variables.

      Perhaps it's a limitation in the current perl or, at the least, a semi-bogus warning.

        The warning isn't bogus, and IMHO the correct action should be harsher: the compiler should halt when encountering this situation since you cannot use nested named subroutines to close over the outer variables in perl (or you can, but as you noticed it won't do anything remotely useful).

        Just use an un-named subroutine assigned to a variable instead.

        update: if you think about it for a bit, you might notice why using nested named subroutines as closures isn't very useful anyway: it implies you're redefining the - global - inner subs at each call to the outer subs.

        I would structure your code this way:
          sub do_pkg_need_from_Dist {
              my ($Distdb, $rq_nam, $rq_vr, $rq_op, $Need_msg) = @_;
              my $newpkg;
              ... set $newpkg here ...;
              return $newpkg;
          }

          sub process_need_from_Dist {
              my $Need_msg = sub { ... };
              my $newpkg;
              ...
              if ($rq_nam) {
                  $newpkg = do_pkg_need_from_Dist(..., $Need_msg);
              } else {
                  $newpkg = $Distdb->find_rpm_owning_file_to_pkg($need);
                  if ($newpkg) {
                      $newpkg->add(...);
                  } else {
                      Serious ...;
                  }
              }
          }
Re: Perl scoping not logical...?
by Jenda (Abbot) on Apr 27, 2008 at 21:24 UTC

    I'm afraid my post will be full of don'ts.

    First, don't use subroutine prototypes! They are NOT what you seem to think they are. Prototypes are NOT designed to let you specify and test the number of parameters passed to a subroutine, they are designed to let you instruct the parser to parse the code in a different way, to emulate the behaviour of some builtins or add something that looks like statements instead of normal functions. Drop them! And especially do drop them when declaring unnamed subroutines, they can't ever mean anything there.
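
    A small sketch of what that means in practice. The names show_one and my_push are made up for illustration; the point is only that a prototype changes how the call is parsed and is skipped altogether when the sub is called through a reference.

```perl
use strict;
use warnings;

# ($) imposes scalar context on the argument; it does not check its type.
sub show_one ($) { return $_[0] }

# (\@@) makes the parser pass the first array by reference, like push.
sub my_push (\@@) {
    my ($aref, @items) = @_;
    push @$aref, @items;
    return scalar @$aref;
}

my @list = (10, 20, 30);

my $seen  = show_one(@list);      # @list in scalar context: its length
my $count = my_push(@list, 40);   # parsed as my_push(\@list, 40)

# Calling through a reference ignores the prototype entirely,
# so the array reference has to be passed by hand.
my $code   = \&my_push;
my $count2 = $code->(\@list, 50);

print "$seen $count $count2\n";   # 3 4 5
```

    Note that show_one(@list) silently receives the element count rather than complaining about a wrong argument, which is why a prototype is a poor substitute for argument checking.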

    Second, don't put one named subroutine inside another! If you need two subroutines to share a variable, you should do it like this:

      {
          my $shared;
          sub foo {
              ...
              $shared++;
              ...
          }
          sub bar {
              ...
              print "Foo called $shared times.\n";
          }
      }
    In this case keep in mind that all, even recursive invocations of the subroutines share the same variable(s)!
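
    A runnable version of the same pattern, as a sketch with hypothetical names (bump, call_count): the bare block gives both named subs one private variable that lives for the whole program.

```perl
use strict;
use warnings;

{
    my $calls = 0;    # one variable, created once, private to this block

    sub bump       { return ++$calls }
    sub call_count { return $calls }
}

bump() for 1 .. 3;
print "bump called ", call_count(), " times\n";   # bump called 3 times
```

    No "will not stay shared" warning appears here, because the block is not inside another subroutine and so is only ever entered once.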

    If you do want a subroutine that has access to the current invocation's lexical variables, you have to use an unnamed subroutine.

    Imagine this:

      sub foo {
          my $i = shift;
          return if $i <= 0;
          sub printFoo {
              print "The \$i=$i\n";
          }
          printFoo();
          foo($i-1);
      }
    Now, which $i should the printFoo access? Keep in mind that if you call foo(4), then the first $i gets set to 4, then foo(3) gets called, another $i is set to 3, foo(2) is called ... so at some point you have 5 different $i variables! Also, the named procedure, even though it was written inside the curlies of another subroutine, is NOT local to that subroutine! So what $i do you want to use when I call printFoo() directly?
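
    For contrast, here is a sketch of the unnamed-subroutine variant, where the question has a definite answer: each invocation's closure sees that invocation's own $i. The @seen array and $record_i are names invented for the example.

```perl
use strict;
use warnings;

my @seen;

sub foo {
    my $i = shift;
    return if $i <= 0;

    # Anonymous sub: closes over the $i of this particular invocation.
    my $record_i = sub { push @seen, $i };

    foo($i - 1);      # recurse first, so deeper invocations finish before...
    $record_i->();    # ...this closure records its own $i
}

foo(3);
print "@seen\n";   # 1 2 3
```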

    BTW, why do you even bother declaring the do_pkg_need_from_Dist() subroutine if you call it just once? If you did declare it outside the process_need_from_Dist() I could understand that you want to simplify the code by extracting and naming one part of it, but since the body of the do_pkg_need_from_Dist() is inside the process_need_from_Dist(), this can't be the reason.

      ...You ask why declaring if I call once...it *IS* to simplify it.

      What I wanted (and have used in other languages) is lexically scoped, named subroutines. That perl makes all named subs global is related to it coming from being an unstructured, 'batch' language with no subroutines, and only global variables. We have lexically named variables, but subs, starting from a more primitive place than variables did (vars at least existed, while subs didn't), aren't yet at the place where one can use lexically named subs.

      For *me*, (and this is for me), having the subroutine internal to the only subroutine that is using it makes it more clear. I know -- unquestionably, that the routines are intimately related and shouldn't be considered for separate use unless they are further refined. At a glance, I can see what routines are only used by 1 routine.

      Normally when one makes a change in a large program, you need to be sure that the changes you are making to a subroutine won't cause some other place in the code to break because they had different expectations about what the sub was to return or "do". I don't do it as often in perl (because I have to go through the var->anonsub(args) ) and don't normally have the luxury of so finely tuning code at that level.

      But you can't argue that local-subs aren't more clear unless you also want to argue that locally declared variables are not better (for the same clarity reason, among others) than global variables.

      I've used (and use) the structure you mentioned about having multiple subs in outer layers of braces. But in some instances, (like this one), the subroutine came from inline in the same routine -- and wasn't envisioned as a "co-routine", but a sub-function that I desire to be both named and treated as a lexical variable. As I replied elsewhere, above, something like a my sub foobar() {....}; that has the same lexical rules as variables in the same scope. So...yeah... I currently *can't* have lexical named subs, but that doesn't mean that this is a "good" thing. :-) It is possible for perl5 to evolve beyond where it's at now isn't it?
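
      For readers arriving later: Perl 5 did eventually gain this. Lexical subs (my sub) were added as an experimental feature in 5.18 and became stable in 5.26. The following is only a sketch assuming a perl of 5.26 or newer; outer and add are made-up names.

```perl
use v5.26;
use warnings;
use feature 'lexical_subs';   # harmless on 5.26+, where my sub is stable

sub outer {
    my ($n) = @_;
    my $total = 0;

    # Lexically scoped named sub: visible only inside outer(),
    # and bound to this invocation's $total.
    my sub add { $total += $_[0] }

    add($_) for 1 .. $n;
    return $total;
}

my @results = (outer(3), outer(4));
say "@results";   # 6 10
```

      The second call starts again from zero, so the inner sub is sharing the current call's variable rather than the first call's.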

      As for the types -- I think they catch my typing errors when I'm writing my code before the typing information is stripped off and thrown away by making the calls indirect. And I rely on them altering the code in some places (though not in the code snippet posted). But it's hard to write a "push" operator that doesn't use "\@" as its first parameter. Beyond that, even though I know the types aren't used on funcs I create as objects, I find the prototypes useful, at least during development, as documentation.

      In fact I find it quite annoying that there isn't an _option_ for stronger typing -- in the same way that many people would be upset if the "-w" flag and "use strict" were no longer around. But that's just me -- and not every program nor all the time...but sometimes it's annoying.

      It's inconvenient that the types that exist are so disparaged that common modules like "Exporter" don't even work right with typed modules without some special handling. So of course types become more of a problem because no one uses them or tests to see that their code could work with them. So that types "cause problems" becomes a self-created and self-fulfilling prophecy. Oh well... :-/

        But you can't argue that local-subs aren't more clear unless you also want to argue that locally declared variables are not better (for the same clarity reason, among others) than global variables.

        Sure you can, unless you're in the habit of writing to global subs. Global state isn't the problem. Mutable global state is the problem.

        Locally declared subroutines might make the code clearer if they were visually separated from the rest of the code of the subroutine and if they were used more than once within the subroutine.

        Otherwise they just confuse the flow.

        If you need another variable scope, a block is enough. In Perl lexical (my) variables are block scoped, not subroutine scoped. And if you want to "name" the block, use a comment.

        Subroutine prototypes are NOT subroutine types! Please read When to use Prototypes? and Far More Than Everything You've Ever Wanted to Know about Prototypes in Perl.

        I understand the desire to test types of variables and subroutines at compile time, but it's not possible in Perl5. Mostly.

Re: Perl scoping not logical...?
by jettero (Monsignor) on Apr 27, 2008 at 16:38 UTC
    I think the problem is that your named sub is only compiled once when the surrounding sub is compiled (or thereabouts).

    What you want is a sub that's compiled each time through the loop (so it uses the lexicals from this run instead of the first run).

      sub test {
          my $global_lexical_sorta = 0;
          my $other = 0;
          my $local_sub = sub { $global_lexical_sorta ++ };
          sub named { $other ++ }
          $local_sub->() for 1 .. 10;
          &named for 1 .. 10;
          print "$global_lexical_sorta $other\n";
      }
      &test;
      &test;
      # you get 10 10 -- $gls and $o are each incremented to 10
      # and 10 0 -- the second $gls is incremented to 10, but ...
      # the first $o is incremented to 20 while the second $o is not incremented!

    You also seem to be crushing some of your locals... You have my ($rq_nam, $rq_vr, $rq_op);, but also re-my them where you set them equal to @_. That could easily be the actual problem. Not sure.

    -Paul

      Well, the problem is that I am really wanting a lexical named sub but currently, subs are still in the stage of being global to the entire program (variables used to be like that too, but then there used to not be subs, so subs started lower...that's why they're 'sub's. ;-})

      I certainly don't want the routine *compiled* each time I invoke the routine. I want it to use the same lexical frame as the variables that the sub is defined in. The only way to do it in perl is to use anonymous subroutines and put a reference into a variable and use an indirect call $var_func->(args), which is not as visually as clear a syntax (in my mind) as "func(args)".

      The locals aren't really being crushed...they're just in flux....(duplicated, really)....

      I took a large chunk of 'inline' code that seemed like it should have its own function and shoved it into a subroutine and started by passing in all of the variables that were needed -- initially about 6 args, with about 3 that were only used inside the code block (so they could become local vars in the subroutine).

      Wasn't ideal, but that's why I began shifting and playing with how the code was organized and some params were converted, while some (in the example posted) were still being passed as params...but I was changing them slowly as I was tracking down references in the code and seeing how long the variables needed to be "active".

      But that's also how I ended up with a named subroutine -- the subroutine was named for the "functional work" that it was doing. In cleaning/refactoring code, I try to go through and look at parts that are either not clear or are too long or just don't look right. One step in that for me in my own code is "saying what the code does" -- and making that a function-name and moving the code block into the function.

      Usually, I'm refactoring to get commonalities out of a code segment so I can reduce code size and have several places use the same code -- in which case, the 'sub' is usually outside of the sub I'm in. But sometimes, I'm simply refactoring because I'm "digesting"...trying to break the code down into smaller more portable bites or pieces, that I can more easily glance at and tell if they are working or not and easily understand them. It could be similar to going through my writing and trying to eliminate the wordiness (I tend toward overwordiness, though I'm sure no one has noticed; :-)) or break apart larger sentences into smaller ones. Occasionally, I can run into the equivalent of a run-on sentence that takes up an entire paragraph that would be more clear if broken up. If it's code I have some time to clean up, I'm usually much happier with it and I'm more likely to be able to pick it up and work with it again in 6-12 months time -- vs. "stream-of-consciousness" blitz filters or scripts to do a task that I want or need to get done.

      It's like when I'm thinking about the task or work that needs to be done, it's hard to think about how to also make the instructions of how to do that crystal clear to a third party -- they are different foci, which is why I usually like to work in "passes" -- getting something working so I can begin 'testing' my grasp about what I'm doing... Writing the program is a great way to really flesh out a problem -- since it has to be clearly explained at least well enough that the computer can understand what you meant. Converting that to a readable form makes it into a releasable program.

      Does that make sense?

        Does that make sense?

        Not really.

        I certainly don't want the routine *compiled* each time I invoke the routine. I want it to use the same lexical frame as the variables that the sub is defined in.

        You can either pass them in to the sub, or you can use lambdas. In perl, today, you can't have it both ways. Even if you had named lexical subs, they'd still have to be recompiled each time (minus compiler optimizations, which may exist for function ref lambdas). Anyway, I think function refs are perfectly clear. They're very widely used and when you need a lambda, they're the appropriate way to do it.

        Here's just a couple of the common uses. You can think of millions more I'm sure. Don't shy away from function refs. They're quite clear.

          my $awesome = awesome();
          print "still tickin'\n" while $awesome->();

          sub awesome {
              my @a = (1 .. 10);
              return sub { shift @a };
          }

          my %code_table = (
              action1 => sub {print "doin' it\n"},
          );
          $code_table{action1}->(); # also awesome

        People really expect the subs to work the way they do now though, so it's not likely to be changed until perl6 at the earliest. Imagine how much code would break if they suddenly worked differently?

        The locals aren't really being crushed...they're just in flux....(duplicated, really)....

        Oh, I get it. I'm just saying that...

          my ($c,$d);
          my ($c,$d) = (4,5); # ...maybe we can live without the first my?

        Essentially, I was asking if that was unintentional and might be causing your troubles.

        -Paul

Re: Perl scoping not logical...?
by shmem (Chancellor) on Apr 27, 2008 at 20:52 UTC

    Lexicals are cleared at subroutine return, and allocated anew at the next invocation. It seems that there's also a transition between compilation and execution. What is hidden in <readmore> tags might add to the bewilderment and incite further investigation.

    Nested subs is a dragon area (look for Abigail there)... ;-)

    --shmem

    _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                  /\_¯/(q    /
    ----------------------------  \__(m.====·.(_("always off the crowd"))."·
    ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
Re: Perl scoping not logical...?
by ww (Archbishop) on Apr 28, 2008 at 00:30 UTC
Use: my $var = sub { };
by Anonymous Monk on Apr 28, 2008 at 05:04 UTC
    Just replace
      sub do_pkg_need_from_Dist($$$$) {
    
    with
      my $do_pkg_need_from_Dist = sub {
    
    and then replace the calls to
      do_pkg_need_from_Dist()
    
    with
      $do_pkg_need_from_Dist->()
    
    and then everything will work out as you expect.
Re: Perl scoping not logical...?
by John M. Dlugosz (Monsignor) on Apr 29, 2008 at 22:03 UTC
    Hmm, without going through it in detail, I think that Perl 5 doesn't support "closure cloning" like Perl 6. The closure is only made once, but you keep re-creating a variable closed over.

      I'm not sure that's entirely accurate, but I'm not certain I understand exactly what you mean.

      All Perl subroutines -- named or not -- get compiled during compile time into optrees accessed through an internal data structure called a CV. Named subroutines get stored in global symbol tables: hashes where the keys are the names of the subroutines and the values are CVs. Anonymous subroutines are stored elsewise (and I'm not certain enough of how to explain it well).

      CVs have arrays of lexpads attached. Lexpads store lexicals. There are multiple lexpads because any particular point in a program can have called through the same subroutine multiple times, and because lexical scopes nest.

      When you create a new closure at runtime, Perl doesn't recompile the code and create a new CV. It reuses the anonymous subroutine's CV, and attaches the current innermost lexpad. (In detail, there's probably a cloning operation here, but I don't want to get too much into the details.)

      This should be enough to explain why lexical variables won't stay shared.
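
      The visible effect (not the internals) can be sketched from plain Perl. Here make_counter is a hypothetical name; each evaluation of the one sub { ... } expression yields a separate closure over that call's $count, even though the code itself was compiled only once.

```perl
use strict;
use warnings;

sub make_counter {
    my $count = shift;               # a fresh $count for each call
    return sub { return $count++ };  # same compiled code, new captured variable
}

my $from_one = make_counter(1);
my $from_ten = make_counter(10);

my @got = ($from_one->(), $from_one->(), $from_ten->(), $from_one->());
print "@got\n";   # 1 2 10 3
print "distinct closures\n" if $from_one != $from_ten;
```

      A named sub gets no such per-call copy, which is why it stays attached to the variables of the first invocation.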

        I think the point is that a named subroutine doesn't get the full closure treatment. It never "creates a new one at runtime" (called cloning in the Perl 6 synopses).

      Yeah...not my intent. I was trying for a non-closure, lexically scoped, named-function, but got stuck with the "if named(sub) then must be global", limitation in perl5 (maybe perl6 allows lexically scoped, *named* subs?)


      -linda

        Ah yes, writing a named sub in Perl 5 does not interact in the expected way with the context.

        In Perl 6, you can nest subs, no problem. They find the proper variable in their active caller, if you refer to one.