tadman has asked for the wisdom of the Perl Monks concerning the following question:

I've just stumbled into a curious subtlety:
#!/usr/bin/perl -w
use strict;

PutFoo();

my ($foo) = "Goo";

PutFoo();

sub PutFoo {
    print "Boo hoo, foo is $foo\n";
}
It would seem that any code inside of PutFoo() would have access to variables such as "$foo", but that does not seem to be the case. Perl skips over the definition of $foo, and runs PutFoo the first time without benefit of the initialization.

"-w" puts out a curious warning, "Use of uninitialized value in concatenation (.) at prog line 11." which doesn't seem to communicate much unless you knew to expect this sort of thing.

What strikes me as odd about this is that Perl splits the declaration and initial definition into two pieces. The symbol is declared, as undef, but the definition is run like a separate piece of code. This is quite unlike the behavior of C++, for example, where the declaration and assignment of initial value are one and the same, and one cannot occur without the other if they are specified together. Perhaps this is the price of a more dynamic language.

Is there a reason this is done, or is this a point of debate? Thanks for any tips.

(tye)Re: Perl Turning a Blind Eye
by tye (Sage) on Jan 30, 2001 at 02:50 UTC

    I always use a BEGIN block to initialize "static" variables:

    #!/usr/bin/perl -w
    use strict;

    PutFoo();

    my $foo;
    BEGIN { $foo = "Goo"; }

    PutFoo();

    sub PutFoo {
        print "Boo hoo, foo is $foo\n";
    }

    The declaration of $foo happens at compile time, so I strongly feel that you should make the initialization happen at compile time as well. This fixes your problem.

    What is happening is that Perl first runs through the file once (compile time) and compiles all of the code and executes key bits that are required at compile time. The key bits include BEGIN blocks, use statements, and declaration of lexical variables. Then the compiled code is executed (run time) which usually behaves like Perl running through the file a second time executing the bits it didn't execute the first time.
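
A minimal sketch of the two passes described above, assuming only core Perl. The @events array is a hypothetical log added for illustration and is not part of anyone's code in this thread; it records the order in which each piece actually runs.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical log of the order in which each piece of code runs.
our @events;

# An ordinary statement: executed at run time, after the whole file is compiled.
push @events, "run time: first statement";

my $foo;                      # the name is known to the compiler from here on
BEGIN {
    $foo = "Goo";             # executed as soon as this block has been parsed
    push @events, "compile time: BEGIN block";
}

# Another ordinary statement; by now the BEGIN block has long since run.
push @events, "run time: foo is $foo";

print "$_\n" for @events;
```

Although the BEGIN block appears second in the file, its entry lands first in @events, because it runs during the compile pass while the two plain push statements wait for run time.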

            - tye (but my friends call me "Tye")
      As you know, this is an area of disagreement between competent people. For instance (as we already discussed) I happen to disagree with you on this item...

        You'll note I used the word "I" twice in that reply. My solution fixes the problem. If you want to propose another solution, then please feel free.

        The original problem is a good example of the type of thing I've seen people be burned by repeatedly, which has led me to adopt the practice I've described. I'll let the original author make their own decision based on the facts provided. You have provided none here, except for the existence of your opinion. I guess the mere existence of your opinion is supposed to sway us? (: Or do you expect me to make your arguments for you? ;)

                - tye (but my friends call me "Tye")
Re: Perl Turning a Blind Eye
by ichimunki (Priest) on Jan 30, 2001 at 03:06 UTC
    I'm not sure I see the problem here. You haven't declared or assigned a value to $foo the first time you call PutFoo(). The language is following your statements in the order in which they appear. If you are going to use a global variable like that, you need to declare it before starting the block that calls subroutines that depend on it.

    Note: that's what tye's code does by subverting BEGIN. But that makes the code harder to follow. Debugging is a lot easier when you keep the parameters for a routine in the call to the routine, like PutFoo( $foo ), and then use shift (or @_) to get the argument(s) in sub PutFoo {}.
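
A small sketch of the argument-passing style suggested above. It reuses the PutFoo name from the original question; returning the string instead of printing it inside the sub is an assumption made here so the result is easy to inspect, not something the thread prescribes.

```perl
use strict;
use warnings;

# Takes its input as an argument rather than reading a file-level lexical,
# so the order of declarations elsewhere in the file no longer matters.
sub PutFoo {
    my ($foo) = @_;           # equivalently: my $foo = shift;
    return "Boo hoo, foo is $foo\n";
}

my $foo = "Goo";
print PutFoo($foo);
```

With this shape the sub can only ever see the value it was handed, so calling it before $foo is assigned becomes a visible mistake at the call site.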
      I understand what Perl is doing, but it is a bit misleading to have the variable pseudo-defined, but not fully defined. For instance, Perl knows that there is a variable called $foo in the first call, the parser has picked it out, but it does not know what it is. If you change the reference in PutFoo() from "$foo" to "$bar", all sorts of warnings go off and the program will not run. "$bar" is a deal-breaker for compilation.
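
The difference between the two cases can be shown with a string eval, which compiles a snippet on demand. This is only a sketch: compile_error is a hypothetical helper invented for this illustration, and the snippets are simplified stand-ins for the original program.

```perl
use strict;
use warnings;

# Hypothetical helper: compile and run a snippet under strict, returning the
# error text, or an empty string if nothing went wrong.
sub compile_error {
    my ($code) = @_;
    my $ok = eval "use strict; $code; 1";
    return $ok ? '' : $@;
}

# $foo is declared, so the closure compiles even though $foo holds no value yet.
my $declared   = compile_error('my $foo; my $f = sub { return "foo is $foo" };');

# $bar is never declared, so strict rejects the snippet during compilation.
my $undeclared = compile_error('my $foo; my $f = sub { return "foo is $bar" };');

print "declared:   ", ($declared   ? "rejected" : "compiles"), "\n";
print "undeclared: ", ($undeclared ? "rejected" : "compiles"), "\n";
```

strict checks only that a name has been declared, which is a compile-time fact. Whether the variable holds a value yet is a run-time fact, and only -w comments on that.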

      The only reason I've even discovered this is because I wanted to have some "static" data for a function, but wanted to minimize the "commute" between editing the function and editing its associated data. What was irritating was that Perl insisted that my variable was "defined", but it did not contain any data. There is no warning for hashes or arrays unless you are printing them. 'strict', '-w', and 'taint' were all cool with my wacky, disordered definition. I wasn't.
      #!/usr/bin/perl -wT
      my ($global_vars....) = (global_var_definitions);
      # : Continues for some time
      sub func1 {}
      sub func2 { calls FuncN(), for example }
      # : many functions
      my (%funcN_data) = ( stuff );
      sub funcN { reference to %funcN_data fails silently }
      The only solution is to carefully organize the order of the functions, or, uh, use the BEGIN{} solution proposed by tye, which, I will have to agree, is a little out of the ordinary. I mean, it gets the job done, but at what price?

      I can only assume that Perl behaves this way because it is too difficult to implement differently, or because of consensus or preference on the part of the implementor. Sure would be nice to have a real error, though.
        I think your pseudo-defined complaint is more a matter of your having poor expectations, rather than Perl doing something unreasonable. What would your expectation be of this code?
        my $foo;
        print_foo();
        $foo = "Hello\n";
        print_foo();
        sub print_foo {print $foo;}
        The first call shouldn't see the initialization because the straight-line code has one function call before $foo has been assigned, and one after. Now should it matter where the my is? How about these?
        my %data = default_data();
        sub whatever {
          my $self = shift;
          # ...
        }
        When Perl first sees those it may not be able to even attempt to do the assignment.

        Perl uses a very simple rule. There is stuff that it takes care of when it parses and compiles the file, and there is stuff that it takes care of when it executes. Figuring out what function to call and what variable is which is part of parsing and compiling. Actually doing things with those variables (like assignment and function calls) is taken care of at execution, which means that those actions have available to them everything that has already happened while executing, and the definitions of everything in your code.

        What it also means is that function calls don't see assignments that haven't happened already. In other words, Perl does actions in the order it sees them. Not very surprising at all when you think about it like that!

        Now for your problem there are two obvious solutions. One is to move the functions and private data into their own module. Now you load and initialize that data in one obvious step. The other is to memoize your data lazily like this:

        {
          my %private_stuff;
          sub some_func {
            unless (%private_stuff) {
              %private_stuff = qw(some defaults);
            }
            # ...
          }
        }
        This is something that tye and I have disagreed on for a bit. Personally I never use BEGIN in the way that he does, and I have yet to find a situation where I didn't find the above two solutions sufficient.
        Is %funcN_data something that needs to be declared and kept as information in the main block? Is another function besides funcN going to need that information? If so, it's global data, and you should declare it right away. Otherwise, it can be made lexical to funcN by putting the my (%funcN_data) = (stuff); inside the sub {}'s. If the data in that hash is persistent, this looks like an excellent case for using package and bless, and now you're into OO Perl, which will probably feel familiar to someone coming from a C++ background.
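
A rough sketch of what the package-and-bless route might look like for the original example. The class name Greeter, its foo field, and the put_foo method are all hypothetical names chosen for illustration; nothing in the thread specifies them.

```perl
use strict;
use warnings;

# Hypothetical class; the name and fields are illustrative only.
package Greeter;

sub new {
    my ($class, %args) = @_;
    # The data lives inside the object, so it exists as soon as new() returns,
    # regardless of where the class sits in the file.
    my $self = { foo => $args{foo} };
    return bless $self, $class;
}

sub put_foo {
    my ($self) = @_;
    return "Boo hoo, foo is $self->{foo}\n";
}

package main;

my $greeter = Greeter->new(foo => "Goo");
print $greeter->put_foo;
```

Because the constructor call is the only place the data gets set, there is no window in which a method can run against a declared but unassigned variable.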