Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

In the course of writing a bunch of perl code for my firm, I noticed that some subs were used over and over and some variables were most naturally implemented as global variables.

After some experimentation I hit upon a method of making this happen. Let "program.pl" be one of the perl programs which (a) needs to use one of these common subs and (b) needs to share some variables with this sub.

Context: This is in Windows 10 (ugh) using Strawberry perl v5.20.2.

I construct a file "includee.pm" which looks like this:

    use vars qw { $SHAREDVARIABLE1 @SHAREDVARIABLE2 %SHAREDVARIABLE3 ... };

    sub function { ... }

Then in program.pl I put at the top this line:

    use includee;

Both program.pl and includee.pm are in the same folder, which is also the current folder when the program runs.

This actually produces the desired effect: the shared variables are accessible in both program.pl and includee.pm, and the sub function is callable from program.pl.

The question is: is this legitimate? Am I depending on a quirk, which might go away in a future version of perl? In particular, will this method work in a linux context also?

Replies are listed 'Best First'.
Re: Common subs and Global Variables
by eyepopslikeamosquito (Archbishop) on Mar 18, 2024 at 21:46 UTC

    I noticed that some subs were used over and over and some variables were most naturally implemented as global variables

    Generally, I would advise you to minimize the use of global variables and especially to avoid keeping state in global variables.

    Is it feasible to refactor your code into cohesive modules (with each module having unit tests) and write short, easy to understand mainline scripts that call these modules?

    👁️🍾👍🦟
      I have used global variables in one form or another for years, and I have been constantly admonished that it is a bad thing to do. Could someone please provide a cogent argument why? I admit that it takes some care to avoid name clashes and other silly problems, but aside from such administrative considerations, what is the issue?

        The biggest issue is "action at a distance". That is, it is much harder to ensure correct management of global variables because the code that may affect them may be scattered through the code base. That makes it hard to think about how the various places that may alter or use a global variable interact.

        For trivial code it doesn't matter at all. But the use of global variables doesn't scale well as code size increases.

        A significant exception to the "don't use globals" guideline is using global constants. Because they are constant there is no issue with figuring out how their value may change (it doesn't - they are constants!). Using global constants to manage things like scaling coefficients or other magic values is vastly better than using manifest constants.
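
A minimal sketch of the global-constants exception, using the core constant pragma. The names CM_PER_INCH, MAX_RETRIES and inches_to_cm are hypothetical and only there for illustration.

```perl
use strict;
use warnings;

# Hypothetical names: CM_PER_INCH and MAX_RETRIES are illustrative only.
use constant {
    CM_PER_INCH => 2.54,
    MAX_RETRIES => 3,
};

# The constant is visible throughout the package, but it cannot be
# assigned to like a variable, so no distant code can change it.
sub inches_to_cm {
    my ($inches) = @_;
    return $inches * CM_PER_INCH;
}

print inches_to_cm(10), " cm\n";
```

Because the value is fixed at compile time, a reader never has to search the code base for places that might have modified it.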

        Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond

        I have used global variables in one form or another for years, and I have been constantly admonished that it is a bad thing to do. Could someone please provide a cogent argument why?

        Surprised I don't have a list of references on this topic yet. Here's a start. Other cool references welcome.

        References

        From On Coding Standards and Code Reviews:

        • Information hiding: Minimize the exposure of implementation details; provide stable interfaces to protect the remainder of the program from the details of the implementation (which are likely to change). Don't just provide full access to the data used in the implementation. Minimize the use of global data. Avoid Action at a distance.

        From On Interfaces and APIs:

        • Before lexical file handles were introduced in Perl 5.6, those evil old global file handles spawned a host of unfortunate idioms, such as: select((select($fh), $|=1)[0]). In terms of a clear and simple interface to perform a simple task (set autoflush on a file handle), it doesn't get much worse than that, a compelling illustration of why keeping state in global variables is just plain evil. Thankfully, nowadays you can replace it with: $fh->autoflush().
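
A small sketch of the lexical-handle version mentioned above, assuming a temporary file as the target. The file and variable names are made up for the example.

```perl
use strict;
use warnings;
use IO::Handle;                  # explicit load, needed on older perls
use File::Temp qw( tempfile );

# A temporary file keeps the example self-contained.
my ($fh, $path) = tempfile( UNLINK => 1 );

$fh->autoflush(1);               # replaces the select/$| idiom
print {$fh} "written immediately\n";

# A second handle can read the line while $fh is still open.
open my $reader, '<', $path or die "Cannot open $path: $!";
my $line = <$reader>;
close $reader;

print "read back: $line";
```

The setting lives on the handle itself, so no global "currently selected handle" has to be saved and restored.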

        👁️🍾👍🦟

        Encapsulation, the same reason we use my variables.

        Limiting the scope of variables has a number of benefits. Primarily, that name clashes are less likely, and that it's easier to find users of the variable. Don't underestimate the value of the latter; it's huge. Conversely, using variables as part of the module's interface creates a very rigid interface.

        As code grows more complex over time (and into multiple modules), it is easy to overwrite a global variable when you re-use the name by accident. And it can be hard to trace any problem, because you just can't put a debug print/stacktrace in the ONE function that is able to modify the variable.

        Sometimes a couple of global variables might be the right choice, but more often than not they are a liability.

        For me, the big exception to the rule is when I write code for microcontrollers. When you only have like 1000-4000 bytes of RAM (if you are lucky), ditching modern OO and basically laying out the memory map by hand is sometimes the best (and only) option. It's astonishing how much you can achieve with a kilobyte of memory when you spend a year shuffling bits and bytes around in your memory layout...

        PerlMonks XP is useless? Not anymore: XPD - Do more with your PerlMonks XP

        The "administrative considerations" remind me of two quotes:

        ... but other than that, how was the play Mrs. Lincoln?
        ... but other than that, how was the parade Mrs. Kennedy?

        If you can avoid the administrative parts without needing to compromise on code functionality, why wouldn't you?

        I want to emphasize that global *settings* are not really bad (things acting like environment), but global *variables* (meaning that they change to hold different data as the program runs) usually are.

        I will further qualify that in Perl specifically, you have the 'local' keyword which can change a global variable to affect only things within that scope, and then it goes back to its original setting at the end of that block. This pattern makes global variables much less messy than in any other language, and I use this pattern in a variety of sticky situations where normal function parameters would be awkward or inefficient.
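
A minimal sketch of the local pattern described above. The global $Indent and the subs emit and emit_nested are hypothetical names chosen for the example.

```perl
use strict;
use warnings;

# Hypothetical package global acting as a setting.
our $Indent = 0;

sub emit {
    my ($text) = @_;
    return (' ' x $Indent) . $text;
}

sub emit_nested {
    my ($text) = @_;
    # The new value is seen by emit() and anything else called from here;
    # the old value is restored when this sub exits, even via die.
    local $Indent = $Indent + 4;
    return emit($text);
}

print emit('top'), "\n";
print emit_nested('inner'), "\n";
print emit('top again'), "\n";
```

The change is confined to the dynamic scope of emit_nested, so callers never see a leftover value.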

        But, the point still remains that code like this is abjectly terrible:

        # This initializes $main::foo
        function_one();
        # This uses the value of $main::foo and updates it
        function_two();
        # This relies on the modified value from function_two
        function_three();

        Why is it so terrible? Because when the maintenance programmer comes along and tries to implement some new feature or fix some bug, they might cause function_two to not get run, and now function_three is broken. The comments I added above are typically not present, and they waste tens of hours hunting high and low through all the code to find out why it broke, and what part of their change caused it.

        Compare with:

        my $value = function_one();
        function_two(\$value);
        function_three($value);

        Without even having any comments, you can see that $value comes from function_one, that function_two reads and maybe writes it, and that it is used by function_three. These clues dramatically speed up development and debugging. As another benefit, you can refactor this into an object where the _one, _two, _three workflow occurs during method calls. If $value is global, you can't refactor this easily, or even realize that it will be hard to refactor.
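
One way the object refactoring could look, as a rough sketch. The Workflow class, its step names and the arithmetic are all hypothetical stand-ins for the real function_one, function_two and function_three.

```perl
use strict;
use warnings;

# Hypothetical class; the arithmetic is a stand-in for real work.
package Workflow {
    sub new {
        my ($class) = @_;
        return bless { value => undef }, $class;
    }

    sub step_one {
        my ($self) = @_;
        $self->{value} = 10;
        return $self;
    }

    sub step_two {
        my ($self) = @_;
        # A skipped step fails loudly here instead of breaking later code.
        die "step_one has not run\n" unless defined $self->{value};
        $self->{value} *= 2;
        return $self;
    }

    sub step_three {
        my ($self) = @_;
        return "result: $self->{value}";
    }
}

package main;

my $wf = Workflow->new;
print $wf->step_one->step_two->step_three, "\n";
```

The state travels with the object, so each method can check what it depends on rather than trusting that some earlier call set a global.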

        “… what is the issue?”

        “The principle is that a declaration in one part of the program shouldn’t drastically and invisibly alter the behavior of some other part of the program.” (Mark-Jason Dominus, Sins of Perl Revisited, 1999)

        «The Crux of the Biscuit is the Apostrophe»

Re: Common subs and Global Variables
by LanX (Saint) on Mar 18, 2024 at 20:12 UTC
    The recommended clean way to do this is to export variables and functions, see import and Exporter

    What you're doing works in Linux and won't go away.

    But once someone wants to introduce new namespaces via package, or to have better control over which vars and subs are included, you'll likely run into problems.

    Edit

    PS: FWIW what you're doing is similar to "sourcing" in shell languages, and could already be done with do FILE instead of use ...

    ...and frankly I'm surprised you don't need a use lib "."; to allow modules in the local . dir

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    see Wikisyntax for the Monastery

      I'm surprised you don't need a use lib ".";

      The Anonymous OP said "This is in Windows 10 (ugh) using Strawberry perl v5.20.2." and I'm pretty sure . was only removed in a later version (5.26 maybe?).


      🦛

        5.26 is entirely correct: https://metacpan.org/release/XSAWYERX/perl-5.26.0/changes

        For security reasons, the current directory (".") is no longer included by default at the end of the module search path (@INC). This may have widespread implications for the building, testing and installing of modules, and for the execution of scripts. See the section "Removal of the current directory (".") from @INC" for the full details.

Re: Common subs and Global Variables
by bliako (Abbot) on Mar 19, 2024 at 11:24 UTC

    If what you are doing now stops working in the future for some reason, then you can set/get my variables, i.e. local to the includee module, via setter/getter subs.

    package includee;

    my $SHAREDVARIABLE1;

    sub sharedvariable1 {
        $SHAREDVARIABLE1 = $_[0] if defined $_[0];
        $SHAREDVARIABLE1
    }

    1;

    HOWEVER, this IMO is not good style (at least! see 5' edit below). It is, though, a notch better than loose shared variables (as you propose), as it encloses the variables inside the module and you control the access via a sub which, btw, can have complex control code to limit access or, at least, log who accesses what via a stacktrace.

    Edit after 5': there will be some edge cases in which this has unexpected results. For example, if another module which you include in "program.pl" also includes "includee.pm" and initialises its variables. I am sure there will be more. So, not only bad style but bug-backdoor too (hey shouldn't that be "bugdoor"?). end edit.

    An optimisation to this would be a package of variables, a config package, which holds all variables in, say, a hash and you access them via their keys. I am sure there will be some modules in CPAN for this purpose. Such a config class would be cumbersome in Java or C++ where you can't have mixed-type containers at will (easily and without being termed a heretic at SO). But it works well with Perl (and other dynamically-typed languages). So take advantage of this Perl feature.

    The benefit of a shared-variables object is that you can pass just that object holding the state of your program into each sub which requires access to it without necessarily turning into OO.
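
A minimal sketch of the shared-variables hash passed into subs, without any OO. The keys and the record_file sub are hypothetical, not taken from the original post.

```perl
use strict;
use warnings;

# Hypothetical state: the keys are illustrative only.
my %state = (
    input_dir => 'data',
    retries   => 3,
    seen      => [],
);

# Each sub receives the state explicitly instead of reaching for globals.
sub record_file {
    my ($state, $name) = @_;
    push @{ $state->{seen} }, "$state->{input_dir}/$name";
    return scalar @{ $state->{seen} };
}

record_file(\%state, 'a.txt');
record_file(\%state, 'b.txt');
print join(', ', @{ $state{seen} }), "\n";
```

Every sub that touches the state now says so in its argument list, which makes the users of each value easy to find.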

    And from that stage, perhaps elevate your "program.pl" into an OO class with all shared variables encapsulated into the class with minimal refactoring.

    An added benefit of boxing all your loose shared variables into a single hash/object (OO or not) is that it makes saving and loading state (perhaps with Storable) so much more convenient, that's always a nice feature for long-running scripts.
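
A short sketch of saving and reloading such a state hash with the core Storable module. The hash contents and the file name state.sto are made up for the example.

```perl
use strict;
use warnings;
use Storable   qw( store retrieve );
use File::Temp qw( tempdir );
use File::Spec;

# Hypothetical state hash; a real script would keep its shared data here.
my $state = { processed => 42, queue => [ 'job-7', 'job-8' ] };

# A temporary directory keeps the example self-contained.
my $dir  = tempdir( CLEANUP => 1 );
my $file = File::Spec->catfile( $dir, 'state.sto' );

store( $state, $file );            # write the whole structure to disk
my $restored = retrieve($file);    # read it back as an independent copy

print "processed: $restored->{processed}, queued: @{ $restored->{queue} }\n";
```

Storable files should only be read back from sources you trust, and nstore is the variant to reach for if the file must move between machines.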

    I am a proponent of simple object-oriented programming, i.e. not to the extent of Java's maniacal bureaucracy. I like what Perl offers right now (still can't say about the "new" OO in Perl), because data encapsulation makes programming complex tasks easier, with fewer mistakes and bugs. Take advantage of that.

Re: Common subs and Global Variables
by ikegami (Patriarch) on Mar 19, 2024 at 20:15 UTC

    No, that isn't legitimate. A used/required file should always have a package that matches its name. You could export said vars, though.

    package includee;

    use strict;
    use warnings;

    use Exporter qw( import );

    # I prefer to use `@EXPORT_OK`
    # and list the imported things
    # in the `use` statement.
    our @EXPORT = qw(
        $SHAREDVARIABLE1
        @SHAREDVARIABLE2
        %SHAREDVARIABLE3
        function
    );

    # Using `use vars` is ok, but dated.
    our $SHAREDVARIABLE1;
    our @SHAREDVARIABLE2;
    our %SHAREDVARIABLE3;

    sub function { ... }

    1;  # Modules must end with a true value.

    Note that it's usually better to export subs that manipulate the variables rather than exporting the variables themselves. It provides better encapsulation, which has a large number of benefits.
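
A rough sketch of exporting subs instead of the variable itself. The Counter package and its subs are hypothetical, and the package is declared inline only so the example fits in one file.

```perl
use strict;
use warnings;

# Hypothetical module; normally this block would live in its own Counter.pm.
package Counter {
    use Exporter qw( import );
    our @EXPORT_OK = qw( increment current );

    my $count = 0;    # lexical: code outside this block cannot touch it

    sub increment { return ++$count }
    sub current   { return $count }
}

package main;

# With a real Counter.pm this line would be: use Counter qw( increment current );
Counter->import(qw( increment current ));

increment() for 1 .. 3;
print "count is ", current(), "\n";
```

Only increment and current can change or read the count, so a debug print or a validity check has exactly one place to go.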


    If you want to load a module that's in the same dir as the script, you should add the following to the script:

    use FindBin qw( $RealBin );
    use lib $RealBin;

Re: Common subs and Global Variables
by cavac (Prior) on Mar 21, 2024 at 17:21 UTC

    Am I depending on a quirk, which might go away in a future version of perl? In particular, will this method work in a linux context also?

    You are using a Perl version that hasn't been updated or supported in 10 years. Many of us here are willing to discuss the general strategies of using global variables, modules, object orientation and so on. But when it comes to bugs and quirks in your ancient version of Perl (on an operating system that's nearly end-of-life), you are pretty much on your own.

    Personally, I refuse to support Perl versions or operating systems that are end-of-life and don't receive security updates (with the explicit exceptions of retro-computing, historical exhibits and emulators). I mean, who knows in this day and age. If I help you keep an insecure system running, I might be legally liable, and your company's lawyers might come after me in the worst case if there's a data breach through a known security problem in your system. If you upgrade and you run into compatibility problems, that is another matter entirely. I'm pretty sure many of us here would be more than happy to help with specific problems, if you ask a proper, specific question: How do I post a question effectively?
