in reply to Use of uninitialized variables?

Hey folks, co-worker here. :) Here's a particular, extremely simplified example of what I'm talking about. The example uses scalars, but it applies equally to hashes and arrays. I'm specifically talking about lexical package variables, usually used for inside-out classes. We start by declaring a few variables (scalars, in this case) and then initialize them in various places. We're using undef, but it could be other values. Then a user of the module or class creates/initializes these values in a BEGIN block of their own, and we see what happens.
use strict;
use warnings;

my $stuff1;
my $stuff2 = undef;
my $stuff3;
my $stuff4;

BEGIN {
    $stuff3 = undef;
}

INIT {
    $stuff4 = undef;
}

BEGIN {
    # hypothetical initialization, creation going on.
    $stuff1 = 'stuff1 stuff';
    $stuff2 = 'stuff2 stuff';
    $stuff3 = 'stuff3 stuff';
    $stuff4 = 'stuff4 stuff';
}

print "stuff 1: $stuff1\n";
print "stuff 2: $stuff2\n";
print "stuff 3: $stuff3\n";
print "stuff 4: $stuff4\n";
And the results are:
stuff 1: stuff1 stuff
Use of uninitialized value in concatenation (.) or string at beginproblem1.pl line 47.
stuff 2:
stuff 3: stuff3 stuff
Use of uninitialized value in concatenation (.) or string at beginproblem1.pl line 49.
stuff 4:
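By assigning 'empty' values to these internal package variables, you end up overwriting whatever a user of your package or class set in a BEGIN block, because the 'empty' assignment (at run time or in an INIT block) runs after the user's BEGIN block. We're using perl v5.8.8.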
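The ordering behind those results can be made visible with a small sketch. It is not part of the original script; the names @order and $value are made up for illustration, and it assumes the file is run as a standalone script (so that INIT blocks actually fire).

```perl
use strict;
use warnings;

my @order;                      # plain declaration: nothing is assigned at run time

my $value = undef;              # the "= undef" part is executed at run time
push @order, 'runtime';         # ordinary statement, also run time

INIT  { push @order, 'INIT'  }  # runs once compilation is finished, before run time
BEGIN { push @order, 'BEGIN' }  # runs as soon as this block has been compiled

BEGIN { $value = 'set in BEGIN' }   # later clobbered by the run-time "= undef"

print join(' -> ', @order), "\n";
print defined $value ? $value : '(undef)', "\n";
```

Run as a script, this should report the order BEGIN, INIT, runtime, and $value ends up undefined again, which is the same effect seen with $stuff2 above.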

Re^2: Use of uninitialized variables?
by ikegami (Patriarch) on Jun 12, 2008 at 09:07 UTC

    First, let's get the incorrect terminology out of the way. Lexical and package variables are two different kinds of variables, and they are mutually exclusive. There's no such thing as lexical package variables.

    >perl -le"package PkgA; my $foo = 'abc'; package PkgB; print $foo" abc

    I usually use the term global variable for what you call lexical package variables. It's not perfect, but it's not outright wrong.
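To make the distinction a bit more concrete, here is a minimal sketch contrasting the two kinds of variable. The names $lexical and $pkg_var are invented for the illustration and do not come from the original posts.

```perl
use strict;
use warnings;

package PkgA;
my  $lexical = 'lexical';   # scoped to the enclosing file or block, not to PkgA
our $pkg_var = 'package';   # lives in the symbol table as $PkgA::pkg_var

package PkgB;
# A package statement does not end a lexical scope, so $lexical is still visible.
print "lexical seen from PkgB: $lexical\n";
# The package variable is reachable from anywhere by its fully qualified name.
print "package var by full name: $PkgA::pkg_var\n";
# The lexical has no entry in PkgA's symbol table at all.
print "lexical in symbol table: ", (exists $PkgA::{lexical} ? 'yes' : 'no'), "\n";
```

The last line should print "no": a my variable belongs to a scope, never to a package.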

    Now back to the subject. Why would you initialize a variable twice? That's bad, with or without BEGIN. If I saw code that looked like the following, I'd have a talk with you. And yet, that's exactly what you are doing.

    my $stuff1;
    my $stuff2 = undef;
    my $stuff3 = undef;
    my $stuff4 = undef;

    $stuff1 = 'stuff1 stuff';
    $stuff2 = 'stuff2 stuff';
    $stuff3 = 'stuff3 stuff';
    $stuff4 = 'stuff4 stuff';

    print "stuff 1: $stuff1\n";
    print "stuff 2: $stuff2\n";
    print "stuff 3: $stuff3\n";
    print "stuff 4: $stuff4\n";

    Your code without BEGINs should be

    my $stuff1 = 'stuff1 stuff';
    my $stuff2 = 'stuff2 stuff';
    my $stuff3 = 'stuff3 stuff';
    my $stuff4 = 'stuff4 stuff';

    print "stuff 1: $stuff1\n";
    print "stuff 2: $stuff2\n";
    print "stuff 3: $stuff3\n";
    print "stuff 4: $stuff4\n";

    So your code should be

    my $stuff1;
    my $stuff2;
    my $stuff3;
    BEGIN {
        $stuff1 = 'stuff1 stuff';
        $stuff2 = 'stuff2 stuff';
        $stuff3 = 'stuff3 stuff';
    }

    my $stuff4;
    INIT {
        $stuff4 = 'stuff4 stuff';
    }

    print "stuff 1: $stuff1\n";
    print "stuff 2: $stuff2\n";
    print "stuff 3: $stuff3\n";
    print "stuff 4: $stuff4\n";

    Or if you had anything complex, an adherent to the quoted practice would do

    my $stuff;
    BEGIN {
        $stuff = undef;
        ...
        ...
        ...
        ... Some complex code to initialize $stuff. ...
        ...
        ...
    }

    instead of

    my $stuff;
    BEGIN {
        ...
        ...
        ...
        ... Some complex code to initialize $stuff. ...
        ...
        ...
    }
      Whoops, my bad on the terminology. I didn't know what to call them, but global doesn't seem right either. *shrug* As far as these variables go, my limits the scope in such a way that in a different module the variable isn't available. Try separating the packages into their own modules, like in this example:

      Mytest.pm:

      package Mytest;
      use strict;
      use warnings;

      my $var = 1;

      1;
      And then a user of the module:
      use strict;
      use warnings;
      use Mytest;

      print $Mytest::var . "\n";
      You'll get this:
      Name "Mytest::var" used only once: possible typo at usemytest.pl line 5.
      Use of uninitialized value in concatenation (.) or string at usemytest.pl line 5.
      
      Or this:
      use strict;
      use warnings;
      use Mytest;

      print $var;
      You'll get this:
      Global symbol "$var" requires explicit package name at usemytest.pl line 5.
      Execution of usemytest.pl aborted due to compilation errors.
      
      Without strict and warnings:
      use Mytest;
      print $var;
      or
      use Mytest;
      print $Mytest::var;
      You get nothing; $var is completely invisible outside its module... or am I missing something? :/ Is there another way to access that scalar that I'm missing?

      As far as the example goes, it is an overly simplified demonstration of some interactions between many modules in a medium-sized tool I'm working on. Here's a slightly more concrete usage, and two instances where the issue happens. p1 shows the same kind of issue as before: by assigning undef to the variable on the same line it's created, if it's used later in a BEGIN block, the undef overrides anything done during the BEGIN block. p2 shows the issue in a much more real-world scenario: you have a simple inside-out class, and decide to 'initialize' the "private variable" hashes. For that example, I'm trying it three different ways: initializing inline, using the default 'my' behavior, and initializing in a BEGIN block, to show what happens in each of these scenarios:

      #!/usr/bin/perl -w

      package Problem;

      use strict;
      use warnings;
      use Scalar::Util qw(refaddr);

      {
          my %key1_of;        # leave blank test
          my %key2_of = ();   # initialize test
          my %key3_of;

          BEGIN {
              %key3_of = ();  # initialize with a BEGIN
          }

          sub new {
              my ($class, $ar1, $ar2, $ar3) = @_;
              my $new_object = bless \do{my $anon_scalar}, $class;
              $key1_of{refaddr($new_object)} = $ar1;
              $key2_of{refaddr($new_object)} = $ar2;
              $key3_of{refaddr($new_object)} = $ar3;
              return $new_object;
          }

          sub get_key1 {
              my $self = shift;
              return $key1_of{refaddr($self)};
          }

          sub get_key2 {
              my $self = shift;
              return $key2_of{refaddr($self)};
          }

          sub get_key3 {
              my $self = shift;
              return $key3_of{refaddr($self)};
          }
      }

      1;

      # Here's a user of the above object.
      package Main;

      use strict;
      use warnings;

      # Below: Creating values for the sake of showing the issue at hand...
      # First variable: assigning undef (declaring its initial value),
      # but due to a subroutine or other complex series of processing, it
      # gets assigned a value during a BEGIN block
      my $p1 = undef; # using the 'always assign a value' paradigm here
      my $p2;         # leaving it blank for now
      my $p3 = undef; # same 'always assign' paradigm here, but not used in 'begin'

      # some processing happens in here; for example, let's say these variables are
      # dependent on user input or automatic config files, so $p1, $p2, $p3 might
      # not ever be used. It just turns out that this time, they all are.

      # However, the user of the package creates these instances of the object in a
      # begin block, and for whatever reason, the user can't avoid doing this.
      BEGIN {
          $p1 = Problem->new('P11', 'P12', 'P13');
          $p2 = Problem->new('P21', 'P22', 'P23');
      }

      $p3 = Problem->new('P31', 'P32', 'P33');

      sub printit # for display purposes
      {
          my $p = shift;
          if (defined $p) {
              print "   k1: " . $p->get_key1() . "\n";
              my $k2 = $p->get_key2();
              $k2 = "(undef)" if ! defined $k2;
              print "   k2: $k2\n";
              print "   k3: " . $p->get_key3() . "\n";
          }
          else {
              print "   Error: Object is undefined.\n";
          }
      };

      print "P1:\n";
      printit($p1);
      print "   (Due to scoping of the undef assignment, the object itself is lost.)\n";

      print "P2:\n";
      printit($p2);
      print "   (Note loss of key 2's value; it's lost because %key2_of\n"
          . "    is assigned to () after the object's creation).\n";

      print "P3:\n";
      printit($p3);
      print "   (As expected, no issues at all).\n";
      And then, the output is:
      P1:
         Error: Object is undefined.
         (Due to scoping of the undef assignment, the object itself is lost.)
      P2:
         k1: P21
      
         k2: (undef)
         k3: P23
         (Note loss of key 2's value; it's lost because %key2_of
          is assigned to () after the object's creation).
      P3:
         k1: P31
         k2: P32
         k3: P33
         (As expected, no issues at all).
      
      As you can see, p1 is completely wiped out by the initial assignment, and p2 loses its member variable because () is assigned to the hash after the object is created in the BEGIN block. Setting up the variable in a BEGIN block, i.e., initializing it to 'empty' there, doesn't cause any problems for users, no matter where they use it.

      The issue here is: if we're initializing a value, and the initial value is empty or undef, why do it twice? It simply makes the initialization process more complex. If we're not using it, or not using it right away, we might as well take advantage of Perl automatically setting scalars to undef and arrays/hashes to empty.
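For reference, the default state of freshly declared lexicals can be checked with a few lines. This is only a sketch; the variable names are made up for the illustration.

```perl
use strict;
use warnings;

my ($scalar, @array, %hash);    # declared with no explicit initializers

print "scalar is ", (defined $scalar ? 'defined' : 'undef'), "\n";
print "array has ", scalar(@array), " elements\n";
print "hash has ", scalar(keys %hash), " keys\n";
```

A new scalar starts out undef and a new array or hash starts out empty, so an explicit "= undef" or "= ()" on the declaration adds nothing except a second assignment that runs at run time.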

      Or am I just completely off about all this?

        Your initial series of examples essentially just demonstrates the difference between lexical variables and package variables that has been brought up in the thread already.

        The larger example is more interesting and certainly makes a more coherent case against initialization. Thanks.

        As far as these variables go, my limits the scope in such a way that in a different module the variable isn't available.

        No one's disputing that.

        Here's a slightly more concrete usage, and two instances where the issue happens

        You're creating instances of a class before the class's module has been executed, and you're blaming the module? Change

        ...
        package Problem;
        {
            ...
            1;
        }

        package Main;
        ...

        to

        ...
        BEGIN {
            package Problem;
            ...
        }
        ...

        or move the code to a used file. use adds the BEGIN for you.

        (By the way, the "1;" is useless, and you misspelled "package main;". And note how "package main;" is no longer needed when you move the package statement into the curlies.)
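A minimal sketch of that suggestion, assuming a cut-down inside-out class: the class name Counter, its count method, and %count_of are hypothetical stand-ins for the Problem class above. Because the whole class body sits in a BEGIN block, the "= ()" initialization runs at compile time, before any later BEGIN block creates an object.

```perl
use strict;
use warnings;

BEGIN {
    package Counter;                  # hypothetical class name
    use Scalar::Util qw(refaddr);

    my %count_of = ();                # executed when this BEGIN block runs

    sub new {
        my ($class, $start) = @_;
        my $self = bless \do { my $anon_scalar }, $class;
        $count_of{ refaddr $self } = $start;
        return $self;
    }

    sub count { return $count_of{ refaddr $_[0] } }
}

my $obj;
BEGIN { $obj = Counter->new(5) }      # compile time, but after the class's own BEGIN

print "count: ", $obj->count, "\n";
```

Here the stored value survives even though the hash is explicitly initialized, because the initialization has already happened by the time the object is created.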

        As for the second issue,

        my $p1 = undef;
        BEGIN {
            $p1 = Problem->new('P11', 'P12', 'P13');
        }

        I've already stated I think you're obviously doing something wrong if you're initializing something twice.

        First off, there is a difference between a package variable and a global variable. A package variable is accessible from outside the package, you just have to completely qualify the variable name. A global variable just means a variable visible everywhere within the current scope ... which means a variable declared at the start of a file (or package) is visible throughout that file.

        Here's an example:

        #!/usr/bin/env perl

        use 5.010_000;
        use strict;
        use warnings;

        {
            package MyTest;

            my $global = 'global variable';
            our $package = 'package variable';

            sub access_global { return $global }

            1;
        }

        {
            package main;

            my $package_var = $MyTest::package;
            my $global_var = MyTest::access_global();
            my $warning_var = eval { $MyTest::global };

            say ">$_<" for $package_var, $global_var, $warning_var;
        }

        __END__
        >package variable<
        >global variable<
        ><
        Name "MyTest::global" used only once: possible typo at /Users/io1/Workspace/example/package_variables.pl line 16.
        Use of uninitialized value $_ in concatenation (.) or string at /Users/io1/Workspace/example/package_variables.pl line 18.

        To get at a variable that is global within a package "MyTest" from outside "MyTest", you have to go through a subroutine such as access_global().

        As for the rest of your example, I suggest you use Class::Std to implement your inside-out class. There is a BUILD() method that you can use to initialize variables in non-standard ways.

        Hope that helps =)


        Smoothie, smoothie, one hundred percent natural!