gregw has asked for the wisdom of the Perl Monks concerning the following question:

I just spent a full day tracking down an annoying scoping/namespace bug. Skipping all the extraneous and distracting details, it boiled down to code like this:
local $usefuldata = undef;
...
$usefuldata = 1; # in real life, lots of logic here
...
if (blah blah) {
    local $usefuldata = undef;
    ...
    $usefuldata = 2; # in real life, lots of logic here
    ...
}
insert into database $usefuldata
I was expecting the database to get '2' but it was getting '1' because I had repeated the initialization and definition of the variable in the inner scope. Shame on me. I kept scrutinizing all my $usefuldata-related logic which turned out not to be the problem at all.
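
A minimal runnable sketch of the behaviour described above, not the original script. The sub name demo and the variable $stored (standing in for the database insert) are made up for illustration, and the condition is hard-coded to true.

```perl
use strict;
use warnings;

our $usefuldata;    # a package variable, which is what local operates on
my $stored;         # hypothetical stand-in for the database insert

sub demo {
    local $usefuldata = undef;
    $usefuldata = 1;

    if (1) {                          # stands in for "blah blah"
        local $usefuldata = undef;    # the forgotten inner re-initialization
        $usefuldata = 2;
    }                                 # inner local is undone here

    $stored = $usefuldata;            # sees the outer value again
}

demo();
print "stored: $stored\n";            # prints "stored: 1"
```

The inner local saves the outer value on entry and restores it when the block exits, so the assignment of 2 never escapes the if block.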

OK, so I know why my bug happened. I even knew such bugs could happen. What I now want to know is, is there some useful trick or mechanism like 'use strict' or 'perl -w' that I could use to catch this class of bug more easily? Something that flags instances where I take a variable named my $x or local $x and create a new instance of $x in a different scope?

(Ovid) Re: Detecting scoping/namespace conflicts
by Ovid (Cardinal) on Apr 03, 2002 at 21:23 UTC

    gregw asked:

    What I now want to know is, is there some useful trick or mechanism like 'use strict' or 'perl -w' that I could use to catch this class of bug more easily?

    There are a couple of things you can do. First, local only works on package variables and the pseudo-global "special" variables such as $_ and friends. Since global variables (which are really package variables; by default they just live in the main:: namespace) make "action at a distance" a considerable problem, consider eliminating them from your code and sticking to lexical variables declared with my.

    If, for some reason, this is not feasible, simply don't reuse the variable name. Unless you're doing some really funky stuff, there's no need for local unless you're dealing with Perl's built-in globals.
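
As an illustration of the one case mentioned above where local still earns its keep, here is a small sketch that localizes the built-in $/ to slurp a file. The in-memory filehandle and the names $text, $contents and $separator_restored are assumptions made so the example is self-contained; in-memory handles need Perl 5.8 or later.

```perl
use strict;
use warnings;

my $text = "line one\nline two\n";
open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";

my $contents;
{
    local $/;               # undef inside this block only: read everything at once
    $contents = <$fh>;
}                           # $/ gets its previous value back here
close $fh;

my $separator_restored = defined $/ && $/ eq "\n";
print "slurped ", length($contents), " characters; separator restored: ",
      ( $separator_restored ? "yes" : "no" ), "\n";
```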

    Cheers,
    Ovid


      [...] consider eliminating them from your code and stick to lexical variables declared with my [...]

      But using my instead doesn't catch re-using of variables unless they're in the same scope, which everything inside gregw's if (blah blah) { ... } is not.
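
A small sketch of that difference, assuming warnings are enabled. The helper warnings_for is hypothetical; it compiles a snippet with string eval and collects whatever warnings perl emits.

```perl
use strict;
use warnings;

# Hypothetical helper: compile and run a snippet, returning any warnings raised.
sub warnings_for {
    my ($code) = @_;
    my @warnings;
    local $SIG{__WARN__} = sub { push @warnings, @_ };
    eval "use strict; use warnings; $code; 1" or die $@;
    return @warnings;
}

# Redeclared in the same scope: perl warns that the second my masks the first.
my @same_scope = warnings_for('my $x = 1; my $x = 2;');

# Redeclared in a nested block: perl treats this as intentional and stays quiet.
my @nested_scope = warnings_for('my $x = 1; if (1) { my $x = 2; }');

print "same scope:   ", scalar(@same_scope),   " warning(s)\n";
print "nested scope: ", scalar(@nested_scope), " warning(s)\n";
print @same_scope;
```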

          --k.


      I didn't mean to re-use the variable name. What happened is that originally I was just using it in the inner scope, and then when I started realizing I needed it in the outer scope, I forgot to remove the initialization from the inner scope.

      And one clarification; probably a mistaken simplification on my part in my example. I was using $pkgname::usefuldata in my code, not just a plain $usefuldata... something I resorted to when trying to move a cron script I created into a web application.

      So really, to avoid this in the future, I have to write and keep rewriting my code to use 'my' all the time, rather than expecting some compile time or run-time checking to warn me. Is that basically right?
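
On the fully qualified name: a rough sketch of why use strict has nothing to say about it. The helper compile_error is hypothetical; it compiles a snippet under strict and returns the error text, or an empty string if the snippet compiled and ran.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the compile error for a snippet ('' if none).
sub compile_error {
    my ($code) = @_;
    local $SIG{__WARN__} = sub { };    # warnings are not the point here
    my $ok = eval "use strict; $code; 1";
    return $ok ? '' : $@;
}

# An undeclared, unqualified variable is rejected by strict 'vars'...
my $unqualified = compile_error('$usefuldata = 1');

# ...but a fully qualified package variable is always allowed.
my $qualified = compile_error('$pkgname::usefuldata = 1');

print "unqualified: ", ( $unqualified ? "rejected" : "accepted" ), "\n";
print "qualified:   ", ( $qualified   ? "rejected" : "accepted" ), "\n";
```

So with $pkgname::usefuldata, strict cannot help, and a stray local on it looks like perfectly ordinary code to perl.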

Re: Detecting scoping/namespace conflicts
by stephen (Priest) on Apr 03, 2002 at 22:10 UTC

    I'll start with a brief discussion of my and local. However, the best way of dealing with this situation is to pull out the $usefuldata assignment into a subroutine. See the end of this note for a discussion of that.

    Good: use 'my' not 'local'

    Use my, not local. local is used primarily for temporarily giving a global variable a new value that lasts until the enclosing block exits; for example:

    #!/usr/bin/perl
    use strict;

    sub foo {
        local $, = '... ';
        print ('a', 'list', 'of', "things\n");
    }

    foo();
    print "This", "is", "a", "test\n";

    prints out:

    a... list... of... things
    Thisisatest

    my, on the other hand, always creates a new lexically-scoped variable that is only visible in the enclosing block or file. Use my for all of the variables that you yourself create. So, to modify your code:

    use strict;

    # Transform 'local' to 'my', and get rid of needless
    # initialization to 'undef'
    my $usefuldata = 1; # in real life, lots of logic here

    if ( blah_blah() ) {
        # Eliminated second definition of $usefuldata
        $usefuldata = 2; # in real life, lots of logic here
    }

    insert_into_database($usefuldata)

    Note: code untested

    Better: Refactor To Subroutine

    Problem is, $usefuldata serves no purpose here except as a placeholder. Nobody is incrementing it; nobody is altering it. It's either 1 or 2. (I know this is a simplified case, but the point still holds no matter how complex the logic.) It's far better from your code's standpoint to pull out the code that calculates $usefuldata into its own subroutine.

    use strict;

    # Put all of the $usefuldata stuff in a subroutine
    sub get_useful_data {
        # Eliminated initial assignment of $usefuldata...
        # There's no need to calculate it if blah_blah()
        # is true and we're going to replace the value anyway
        if ( blah_blah() ) {
            # No need to store $usefuldata in a variable...
            # just return it
            return 2;
        }
        else {
            # Since blah_blah() is false, we return the default
            # case
            return 1;
        }
    }

    # Now we can calculate $usefuldata exactly where we want it
    insert_into_database( get_useful_data() )

    Note: Code untested

    An advantage to this is that you've eliminated $usefuldata entirely, and never need to wonder if somehow some other bit of code might have altered it. Everything having to do with $usefuldata is in one place.

    Gets off soapbox

    Update: Looking back, I'm not quite sure if you intended '1' to be the default case, or if the other $usefuldata was completely unrelated. This note assumes that you intended '1' to be inserted if your 'if' statement was false. In any case, replacing the variable assignment with a subroutine will solve the problem, because redefining a subroutine will cause a warning under '-w' anyway.
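
A quick sketch of that redefinition warning, assuming warnings are enabled. The two one-line bodies are placeholders, and the string eval is only there so the warning can be captured and printed.

```perl
use strict;
use warnings;

my @warnings;
{
    local $SIG{__WARN__} = sub { push @warnings, @_ };

    # Defining the same sub twice, as an accidental repeat would.
    eval q{
        use warnings;
        sub get_useful_data { return 1 }
        sub get_useful_data { return 2 }
        1;
    } or die $@;
}

print "warning: $_" for @warnings;
print "get_useful_data() now returns ", get_useful_data(), "\n";
```

The later definition silently wins at run time, but unlike a repeated local, perl at least tells you about it.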

    stephen

Don't think so...
by RMGir (Prior) on Apr 04, 2002 at 13:06 UTC
    You're asking if there's a way to get notified if my or local hides an existing variable.

    Interesting idea...

    However, at least in the case of local, that's its main _purpose_ these days, doing things like local $_ or local $^W. You'd really need some kind of pragma that would let you specify a given variable as "global only".

    So you'd have to know ahead of time which variables you're likely to run into this problem with.

    I think if you have your program/module grasped that well, you're unlikely to run into the problem in the first place.

    Another reason not to do this is that things are hard enough to explain now, between use vars, our, my, and no strict 'vars'. I think adding another flavour might introduce more problems than it solves...
    --
    Mike