in reply to Re^2: Perl scoping not logical...?
in thread Perl scoping not logical...?

The warning isn't bogus, and IMHO the correct action should be harsher: the compiler should halt when encountering this situation since you cannot use nested named subroutines to close over the outer variables in perl (or you can, but as you noticed it won't do anything remotely useful).

Just use an un-named subroutine assigned to a variable instead.

update: if you think about it for a bit, you might notice why using nested named subroutines as closures isn't very useful anyway: it implies you're redefining the - global - inner subs at each call to the outer subs.
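A minimal sketch of the behaviour described above, assuming Perl 5 with strict and warnings enabled. The names outer_named, inner_named, and outer_anon are made up for illustration. The named inner sub is compiled once and stays bound to the first $x, while the anonymous sub is a fresh closure on every call.

```perl
use strict;
use warnings;
no warnings 'closure';   # silence 'Variable "$x" will not stay shared' for this demo

sub outer_named {
    my ($x) = @_;
    # Named subs are compiled once and live in the package, so this one
    # captures only the first instance of $x.
    sub inner_named { return $x }
    return inner_named();
}

sub outer_anon {
    my ($x) = @_;
    # An anonymous sub is a new closure each time outer_anon runs.
    my $inner = sub { return $x };
    return $inner->();
}

my @named = (outer_named(1), outer_named(2));
my @anon  = (outer_anon(1),  outer_anon(2));

print "named: @named\n";   # named: 1 1
print "anon:  @anon\n";    # anon:  1 2
```

The second call to outer_named still reports the value from the first call, which is exactly what the warning is trying to tell you.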

Re^4: Perl scoping not logical...?
by perl-diddler (Chaplain) on Apr 27, 2008 at 19:34 UTC
    Joost wrote: "... the correct action should be...the compiler ... halt when encountering this situation since you cannot use nested named subroutines to close over the outer variables in perl (or you can, but as you noticed it won't do anything remotely useful)." Aye. I think we are seeing a similar problem, but coming to different conclusions about how to address the issue.

    I'm wanting to use certain "semantically-rich" features or behaviors that I'm familiar with, in Perl. As it still has strong roots in its origin as a simple scripting language, it doesn't currently support semantically (syntactically?) structured, non-global functions, nor "subroutine-private" variables that can be shared between subroutines within an enclosing scope for the duration of that scope. Rather than taking the position that perl was never meant to do anything so complex and should therefore die on attempts to do such things, I tend toward enhancing the language with useful constructs or behaviors that it could come to share with more traditional, compiled, high-level languages.

    It is likely that either a property of "sub" or a different term (sublet? :-) ) could be used for subroutines that are lexically scoped in the same way "my"-labelled variables are. Perhaps it could be "so simple" as to use a syntax like: my sub foobar[(optionalproto)]{...}. (Dang, sorta liked sublets...).
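For readers on a newer perl: a syntax very close to this was added after this thread was written. Lexical subs arrived as an experimental feature in Perl 5.18 and stopped being experimental in 5.26. The sketch below assumes Perl 5.26 or later; foo, bar, and $prefix are illustrative names only.

```perl
use v5.26;      # lexical subs are non-experimental from 5.26 on
use warnings;

sub foo {
    my ($prefix) = @_;

    # "my sub" is visible only inside foo, and it closes over the
    # current $prefix on every call to foo.
    my sub bar { return "$prefix: in bar" }

    return bar();
}

say foo("first");    # first: in bar
say foo("second");   # second: in bar
```

Unlike a plain nested named sub, bar is not installed in the package, so nothing outside foo can call it by name.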

    It's not like providing (allowing & supporting) "lexically-scoped" subroutines should be considered so radical -- they are the norm in most languages. Perl, because of its origins as a scripting language, wasn't perceived to need such subtleties as lexical scoping when it was first developed, because it wasn't designed to be a general-purpose high-level compiler-like language. It didn't even have subroutines at one point. But... when it got them, it only implemented one global name space for subroutines, which exists to this day. One can have syntactic division of subroutines within the global name space by prepending package names, but it's not the same as lexically scoped subroutines.

    Don't you think it could be 'doable' -- at least, at first, within a simple case of subroutines being able to share local variables declared at the same level as the subroutines as is nearly the case here? I.e. the named subroutine is declared at the same level as variables it is accessing, with both being declared inside the same function.

    For a previous poster, I tried moving the nested subroutine definition out a level, removing the enclosing brackets, so that sub and variables are at the same level. It seems like that level of nesting should be straightforward, no?

    Regarding your 'update' -- the fact that the nested subroutines are global isn't useful or desired behavior in this situation. I agree it is what the language has implemented at this point, but that doesn't mean a "my" keyword couldn't be added in front of a "sub" definition to simply allow those subroutines to only come into existence like their adjacent "my"-declared variables, no? It's a syntactic difference -- rather than forcing the user to use an anonymous subroutine with ref in a lexically scoped variable, wouldn't the code be more clear if I could use a named-lexically scoped subroutine name?

    I don't see that requiring the extra syntax of making the subroutine 'anonymous', storing it in a variable, and then calling it through that variable could possibly make it more clear for someone reading the code. Neither, IMPO, is putting the function outside of the area where it was intended to be used. That only invites the possibility that later, I or some maintainer might try to call the 'helper function' from a context it was not designed to support. Declaring it within the subroutine emphasizes that it is a subroutine-context-dependent helper routine, no? :-)

    Do you think we could come up with a Perl "RFE" that could "swim"? :-)

    Linda

      It's not like providing (allowing & supporting) "lexically-scoped" subroutines should be considered so radical -- they are the norm in most languages. Perl, because of its origins as a scripting language, wasn't perceived to need such subtleties as lexical scoping when it was first developed, because it wasn't designed to be a general-purpose high-level compiler-like language.

      Certainly not. I'm only aware of Pascal having lexically scoped/nestable subroutines, likely Modula has inherited them. Many other languages only have one, global, namespace for subroutines - certainly, Python, Ruby, PHP, JavaScript, Visual Basic, C and C++ have it that way.

      In Perl, what you want is written in the following idiom:

      sub foo {
          my $bar = sub {
              print "In bar\n";
          };
          $bar->();
      };

      This idiom is frequently used and if somebody cannot read it, they should just learn Perl. You could change into a syntactically different flavour if you're willing to trade lexical for dynamic scope:

      sub foo {
          local *bar = sub {
              print "In bar\n";
          };
          bar();
      };
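A small sketch of what "trading lexical for dynamic scope" means in practice, assuming strict and warnings. The helper sub is a hypothetical addition, not part of the idiom above. While foo is running, bar is visible to everything foo calls, and it disappears again once foo returns.

```perl
use strict;
use warnings;

# Not lexically inside foo, yet it can call bar() while foo is active,
# because the name is looked up in the package at run time.
sub helper { return bar() . " (via helper)" }

sub foo {
    local *bar = sub { return "In bar" };
    return helper();
}

my $during = foo();
my $after  = defined &bar ? "still defined" : "gone";

print "$during\n";             # In bar (via helper)
print "bar after foo: $after\n";   # bar after foo: gone
```

So the local version gives you the plain bar() call syntax, at the cost of bar being reachable from any code that runs during the call to foo.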
        Hi Corion. You said "This idiom is frequently used and if somebody cannot read it, they should just learn Perl."

        Sort of a side issue, maybe, but does this imply you might not agree with Damian Conway's idea of not using perl's "unless/until" operators because they are not as familiar to non-Perl programmers who might follow? Or is it more due to the frequency with which you perceive the idiom to be used?

        If the idiom is frequently used (I've not seen it used that often), is that perhaps because the feature is more common in the compiled languages more typically studied or taught in computer science coursework?

        Perhaps I've had a different exposure, but didn't most of those languages that don't support it (other than C or C++...C++ doesn't?) come out *after* perl, and aren't all of them (or weren't all of them) designed as script languages? I tended to think lack of that feature fell to more primitive languages (from a compiler language perspective). Even "C", as initially used, was barely more than a high level assembler in earlier Unix days.

        I may be imagining or projecting things based on my limited exposure, but besides Pascal and Modula, I believe PL/M and its predecessor PL/I had it, as did the majority of the few third generation computer languages that were developed. I don't remember if Ada had it or not. It seemed to be semantically rich. PL/M was more in use as a system language on Intel machines before C rose in popularity. I remember being saddened at the paucity of language semantics in 'C' when I started getting into more complex programs. But I also seem to remember that languages with object-oriented features like Classes/Packages and multiple inheritance could be written with "pseudo" support for such, since usually one doesn't need or use such features beyond 2nd-level functions, and many people seemed to employ Classes or Packages as 1st-level procedures in languages that didn't have the better compiler support of an HLL (High Level Language, the implication being compiled, as they tended toward being based more on computer language theory than interpretive or scripting languages).

        I'd always hoped that the scripting languages would gain more language richness over time. So much is focused on the accompanying language library these days (many common library functions being 'keywords' in Perl) that not much development is going toward language semantics. I hoped Perl6 might take Perl5 in that direction, but from what little I've seen of it, it seems to be becoming a language as different from perl5 as ruby or python -- more of a revolutionary development than evolutionary, but that may be subjective...

        I'm only aware of Pascal having lexically scoped/nestable subroutines, likely Modula has inherited them. Many other languages only have one, global, namespace for subroutines - certainly, Python, Ruby, PHP, JavaScript, Visual Basic, C and C++ have it that way.

        If you group them into scripted(interpreted) languages vs. compiled, it seems scripting languages mostly don't, but above 2nd generation languages like Basic & Fortran, compiled languages are more likely to have it as a possibility.

        As for C++, it has been a while since I've programmed in it, but can't functions be marked as private or public? If they are private are they still in the global name space?

        But I believe PL/I, and Intel's system programming language prior to "C", PL/M had the feature. Dunno what other languages...so many have risen and died.

        As for the "idiom" being frequent...that doesn't mean it is easily readable. When I see "$var = ..." I think of it as an assignment (it is), not a function declaration. The function declaration (I tried several variations) just isn't as readable as seeing the form:

        sub <description function name> [(arg-proto)] { body....indented...between a sub and a closing }.

        The best I've come up with so far for an attempt to make the subroutine name appear has been a 2-3 line-high form like:

        my $
        print_Need = sub ($) {
            Inf($_[0] ." needed by " . ($insted ? "(installed) " : "") . $rpm_nvra . "\n")
        };

        But I feel like I might risk harm by such contorted structuring to make a local subroutine stand out with its message.

        Even with such contortions, I'm still not excessively happy with the verbose call syntax. Why does the language need "->" for correct meaning, when it would seem more clear (at least in this case) without it. I.e.:

        $print_Need("file foobar needs package xyzzy")
            # seems clearly to be a procedure call
            # (does it have another syntactic meaning I'm forgetting at the moment? +)
        # vs.
        $print_Need->("...");
            # seems less clear
        I see the "->" and think "ah, we are doing an indirect function call to whatever is in $print_Need. I wonder what's in there. Is it an object?" vs. my intended interpretation of "oh, that's printing the Need passed in the arguments". I can't just read it and know what is going on.
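For reference, a short sketch of the call forms Perl 5 does accept for a code reference, assuming strict and warnings. The arrow-less $print_Need("...") form is a syntax error in Perl 5, so it is left out. The body of $print_need here is a stand-in that returns a string rather than the real Inf() call.

```perl
use strict;
use warnings;

# Stand-in for the helper discussed above; returns instead of printing.
my $print_need = sub { return "$_[0] needed" };

my $via_arrow  = $print_need->("xyzzy");     # arrow dereference
my $via_amp    = &$print_need("xyzzy");      # ampersand form
my $via_braces = &{$print_need}("xyzzy");    # ampersand form with block

print "$via_arrow | $via_amp | $via_braces\n";   # xyzzy needed | xyzzy needed | xyzzy needed
```

All three call the same anonymous sub with the same argument list; which one reads best is a matter of taste.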

        First, I have to go look and see what's assigned to "$print_need". Then I find the assignment "$print_need= sub {...}" and now know that the variable "$print_need" is holding a reference to a local, anonymous function. What does the function do? It's not obvious, as the function isn't named but is "anonymous".

        Just because a variable named "print_need" holds a reference to an anonymous function, I still don't immediately read the name as a "verb" describing what the "sub" does, which would be more likely if I saw: "sub print_need(...)" - *LED lightbulb turns on* - ahh, it's a sub that is printing the "need" passed in parens.... At least that'd be my first impression...

        It could be querying or setting the "print_need" attribute on (args), but I'd lean toward the direct-order interpretation first.

        Obviously nothing prevents me doing these things in one of a hundred different ways in perl, but it's an intuitive, self-describing style that I'm searching for. I really don't like comments unless there is much too much to explain...and if there is, I question whether or not the sub"routine" is "clear" enough....why isn't it clear "on its own"... It may be that the comment is necessary, but my first "shortcut" in writing code I want to be able to "re-use" is to make the code "legible" or "readable", with trade-offs going to type-ability (a paragraph-long sub name might clutter things up a bit, as well as make me not want to call the sub, as its name is too long :-) ).

        -Linda

      It's not like providing (allowing & supporting) "lexically-scoped" subroutines should be considered so radical...
      It's not radical. It's been discussed on and off by the perl implementors for years and generally (I think) thought of as a good idea, but it just hasn't been done, probably because it's not considered to be worth the time. You can already use lexically scoped anonymous subs in all situations where you'd use lexical named subs, with only minor extra syntax, and you can assign closures to globs to redefine global subs. Neither requires excessive amounts of code (one additional my statement + one deref per call, and one assignment, respectively). In the meantime, named subroutines in perl are global, and bitching about it isn't going to help.
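A minimal sketch of the "assign closures to globs" option, assuming strict and warnings. install_greeter and greet are hypothetical names. The single glob assignment replaces the global sub greet with a closure over the current $greeting.

```perl
use strict;
use warnings;

sub install_greeter {
    my ($greeting) = @_;
    no warnings 'redefine';    # we replace greet() on purpose
    # The one assignment: greet() now closes over this call's $greeting.
    *greet = sub { return "$greeting, $_[0]" };
    return;
}

install_greeter("Hello");
my $first = greet("world");

install_greeter("Goodbye");
my $second = greet("world");

print "$first\n";    # Hello, world
print "$second\n";   # Goodbye, world
```

Note that greet stays global and stays redefined after install_greeter returns; wrap the assignment in local if it should be undone at scope exit.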

      I don't see that requiring the extra syntax of making the subroutine 'anonymous', and storing it in a variable, then calling it through a variable could possibly make it more clear for someone reading the code.
      It does make it very clear you're calling a lexically scoped subroutine instead of a global one. Not a big advantage, though, I grant you, but perl will never be Scheme. Perl isn't pretty in some ways, but it does provide (sometimes clunky, but reasonable) ways to use almost any programming technique you want. The exceptions are macros, which are just too clunky in perl (see source filters) to be truly useful, and good multi-processor/multi-node support, which still seems to be in its infancy as far as the programming languages I know go, though Erlang seems to be a good step.