in reply to Re: Using guards for script execution?
in thread Using guards for script execution?

Thanks, I appreciate the response.

The suggestion to use main, a function, was to provide a namespace that was not package-level to prevent accidental use of "global" variables. In practice I have no idea how useful this is and how often that mistake occurs.

I agree that it seems strange to reference a default and implicit namespace. Similarly, logic that ensures a script was run rather than imported is particularly verbose in small programs. I asked because a lot of Python programs include such a check by default. It may be an antipattern; I'm not exactly sure.

Re^3: Using guards for script execution?
by kcott (Archbishop) on Mar 01, 2017 at 06:55 UTC
    "... provide a namespace that was not package-level to prevent accidental use of "global" variables."

    I use anonymous namespaces for this. I use them often. They can be nested to an arbitrary depth. Here's a (highly contrived) example:

        my $global;

        {
            my $local_outer;

            # $global known here
            # $local_outer known here
            # $local_inner unknown here

            {
                my $local_inner;

                # $global known here
                # $local_outer known here
                # $local_inner known here
            }

            # $global known here
            # $local_outer known here
            # $local_inner unknown here

            # An entirely different $local_inner:
            my $local_inner;
        }

        # $global known here
        # $local_outer unknown here
        # $local_inner unknown here

        # An entirely different $local_outer:
        my $local_outer;
        # An entirely different $local_inner:
        my $local_inner;

    Beyond avoiding all the issues with global variables, there are additional benefits. When an anonymous block is exited, the lexical variables declared within it go out of scope and can be garbage collected. Also, if those variables were filehandles, Perl automatically closes them for you. Another contrived example:

        {
            open my $fh, '<', $filename;

            # ... read and process file contents here ...

            # As soon as the closing brace is reached:
            # 1) $fh goes out of scope - available for garbage collection
            # 2) an automatic "close $fh" is performed
        }

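    A minimal, runnable sketch of that behaviour, assuming a throwaway file created with the core File::Temp module stands in for $filename: the block writes through a lexical filehandle and never calls close, yet the data can be read back afterwards, because the handle was flushed and closed when $out went out of scope.

    ```perl
    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    # Assumption: a temporary file stands in for $filename.
    my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
    close $tmp_fh or die "Cannot close temp file: $!";

    {
        open my $out, '>', $filename or die "Cannot open '$filename': $!";
        print {$out} "written inside the block\n";
        # No explicit close: $out is closed when the block ends.
    }

    # Read the file back through a second, independent handle.
    open my $in, '<', $filename or die "Cannot open '$filename': $!";
    my $line = <$in>;
    close $in or die "Cannot close '$filename': $!";

    print "Read back: $line";
    ```

    Unlike the contrived example above, the sketch checks the result of open; a failed open would otherwise go unnoticed until the first read or write.
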
    "In practice I have no idea how useful this is and how often that mistake occurs."

    This is very useful, and a practice I recommend as a default coding technique. Scripts that start off very short (e.g. a couple of dozen lines) are very often enhanced and extended until they run to hundreds of lines. It's at this point that problems with global variables become apparent: you switch to debugging mode and start changing multiple $text variables, for instance, to $xxx_text, $yyy_text, and so on; then you start the test/edit cycle, changing the $text variables missed on previous iterations, fixing incorrect renaming (s/$yyy_text/$xxx_text/) or typos (s/$xxxtext/$xxx_text/), and so on.

    This sort of problem does seem very common. We get lots of "What's wrong with my code?" questions where scoping is the underlying cause.

    Although I've focussed on anonymous namespaces here, the underlying objective is to use lexical variables in the smallest scope possible; that scope could also be provided by, for example, subroutine definitions and BEGIN blocks.
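
    As one possible illustration of a subroutine providing that scope, here is a sketch of the common "unless caller" guard. The subroutine name main and its body are hypothetical; the only built-in relied on is caller, which is false at the top level of a file run directly and true when the file is loaded via require, use or do.

    ```perl
    use strict;
    use warnings;

    # Hypothetical script logic, wrapped so that its variables stay
    # lexical to the subroutine instead of living at file level.
    sub main {
        my @args = @_;
        my $text = join ' ', @args;    # visible only inside main()
        return length $text;
    }

    # Run main() only when this file is executed as a script, not when
    # it is loaded from another file.
    print "Total length: ", main(@ARGV), "\n" unless caller;
    ```

    With this layout the same file can be run as a script or loaded by a test file that calls main() directly.
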

    — Ken

      Actually, my $global; is not global. In your example, it is file scoped, so only accessible in that file and only after the declaration.

Re^3: Using guards for script execution?
by stevieb (Canon) on Feb 28, 2017 at 23:43 UTC

    In Perl, globals are not exported by default anywhere, at any time (including functions and methods). In fact, nothing is. You need to explicitly export them before they can be imported into any other module/package that uses a Perl file, so that safeguard is built in.

    In other words, as I said in my last post, things are file-based scope. Anything in another file that uses a different file does not have implicit access (ie. namespaces won't be clobbered) unless that is specifically and explicitly configured.

    You can include other Perl files to your heart's content, and unless you explicitly export things from the included file (in Python, an import), you'll never be able to see them within your current namespace.

    Also, use-ing a Perl file does not execute it, so any executable code you have in a Perl file will not be run when including it into another Perl file. That means that you can have main() code anywhere in a Perl file, even non-scoped (file-level global), that will not be executed or evaluated (into) when including said file in another file.

    I digress a bit. There *are* ways around this, but I believe my fellow monks would agree with me that those are round-about ways, and most definitely not common, standard practice that you'd find in any remotely reasonable example unless you were outright looking for such a way.

      Also, use-ing a Perl file does not execute it,

      This is SO wrong that I can't leave it uncommented, even if you intended to simplify.

      AnomalousMonk++ already explained that used modules are evaluated.

      To explain that a little bit more: use is an extended require happening at compile time, as if wrapped in a BEGIN block. require is an extended do FILENAME that prevents loading the same file more than once and checks the result of loading the file. This is why modules need to return a true value. And finally, do FILENAME behaves like an extended eval STRING, where the string is loaded from a file searched in @INC. All of this is documented in perlfunc.
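
      The difference between require and do FILENAME can be sketched as follows. The module name Counter, the temporary directory and the counter variable $loads are assumptions made up for this illustration; the file simply increments $main::loads every time it is executed.

      ```perl
      use strict;
      use warnings;
      use File::Temp qw(tempdir);

      our $loads = 0;    # incremented each time the file below is executed

      # Assumption: a throwaway module called Counter, written to a temp dir.
      my $dir  = tempdir(CLEANUP => 1);
      my $path = "$dir/Counter.pm";
      open my $fh, '>', $path or die "Cannot write '$path': $!";
      print {$fh} "package Counter;\n", '$main::loads++;', "\n1;\n";
      close $fh or die "Cannot close '$path': $!";

      unshift @INC, $dir;

      require Counter;    # executes the file and records it in %INC
      require Counter;    # already in %INC: nothing is executed
      my $after_require = $loads;

      do $path;           # do FILENAME executes the file every time
      do $path;
      my $after_do = $loads;

      print "after two requires: $after_require\n";
      print "after two more do calls: $after_do\n";
      ```

      The first count is 1 and the second is 3: require ran the file once, and each do ran it again.
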

      But there is more:

      Loading any module does not only parse the file and execute code outside the subroutine definitions. Modules can contain five different code blocks (BEGIN, UNITCHECK, CHECK, INIT, and END) that are executed outside the normal flow. Most notably, BEGIN blocks run while code is still being compiled; UNITCHECK and CHECK run after compilation, INIT runs before runtime, END after runtime. This is documented in perlmod.

      Loading a module using the use function has yet another side effect, as documented in use:

      [use Module LIST] is exactly equivalent to

      BEGIN { require Module; Module->import( LIST ); }

      So, in addition to the "magic" code blocks, a method in the module is invoked. (If the module has no import method, the one implicitly inherited from UNIVERSAL is invoked.) Again quoting use:

      If you do not want to call the package's import method (for instance, to stop your namespace from being altered), explicitly supply the empty list:

      use Module ();

      That is exactly equivalent to

      BEGIN { require Module }
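
      A small sketch of those equivalences, assuming a hypothetical package Noisy that is defined inline and records every call to its import method. Setting $INC{'Noisy.pm'} makes require treat the module as already loaded, so no separate file is needed.

      ```perl
      use strict;
      use warnings;

      # Assumption: Noisy is a made-up package for this illustration.
      BEGIN {
          package Noisy;
          our @calls;
          sub import {
              my ($class, @list) = @_;
              push @calls, "${class}->import(" . join(',', @list) . ")";
          }
          $INC{'Noisy.pm'} = __FILE__;    # mark as loaded for require
      }

      use Noisy;             # BEGIN { require Noisy; Noisy->import(); }
      use Noisy qw(a b);     # BEGIN { require Noisy; Noisy->import('a', 'b'); }
      use Noisy ();          # BEGIN { require Noisy; } - import is not called

      print "$_\n" for @Noisy::calls;
      ```

      Only two calls are recorded, Noisy->import() and Noisy->import(a,b); the use with the empty list leaves no trace.
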

      So, loading a module using use, or even using require, actually runs a lot of code in that module:

      demo.pl

          #!/usr/bin/perl
          use strict;
          use warnings;

          BEGIN {
              print ">>> About to execute 'use LoadMe;' <<<\n";
          }

          use LoadMe;

          BEGIN {
              print ">>> LoadMe was loaded <<<\n";
          }

          print "***** See? No code in LoadMe is executed! *****\n";

      LoadMe.pm

          package LoadMe;

          use strict;
          use warnings;

          INIT {
              print "LoadMe: INIT is evaluated\n";
              system "echo Would I run rm -rf / ?";
          }

          CHECK {
              print "LoadMe: CHECK is evaluated\n";
              system "echo Check check, is this thing on?";
          }

          UNITCHECK {
              print "LoadMe: UNITCHECK is evaluated\n";
              system "echo Luckily, format C: does not work on linux";
          }

          BEGIN {
              print "LoadMe: BEGIN is evaluated\n";
              system "echo what could go wrong?";
          }

          END {
              print "LoadMe: END is evaluated\n";
              system "echo Good bye cruel world";
          }

          sub foo {
              print "LoadMe: foo\n";
          }

          sub import {
              my $class=shift;
              print "LoadMe: import called as ${class}->import(",join(',',@_),")\n";
              system "echo Oh well, no smart comment here";
          }

          print "LoadMe: Module initialisation\n";
          system "echo Perl is fully working and can run arbitary code, including external programs";

          1; # or any other non-false value

      Demo:

          /tmp/use-demo>perl demo.pl
          >>> About to execute 'use LoadMe;' <<<
          LoadMe: BEGIN is evaluated
          what could go wrong?
          LoadMe: UNITCHECK is evaluated
          Luckily, format C: does not work on linux
          LoadMe: Module initialisation
          Perl is fully working and can run arbitary code, including external programs
          LoadMe: import called as LoadMe->import()
          Oh well, no smart comment here
          >>> LoadMe was loaded <<<
          LoadMe: CHECK is evaluated
          Check check, is this thing on?
          LoadMe: INIT is evaluated
          Would I run rm -rf / ?
          ***** See? No code in LoadMe is executed! *****
          LoadMe: END is evaluated
          Good bye cruel world
          /tmp/use-demo>

      Another demo:

      Perl has a -c switch, which has a nice, seemingly harmless first sentence in its documentation:

      -c causes Perl to check the syntax of the program and then exit without executing it.

      Except that this is completely wrong, as explained in the following sentence:

      Actually, it will execute any BEGIN, UNITCHECK, or CHECK blocks and any use statements: these are considered as occurring outside the execution of your program. INIT and END blocks, however, will be skipped.

      So what actually happens is that perl stops after the compile phase has completed, and instead of starting the run phase, it will print a success message unless the compile phase has failed.

      This has consequences. Some code is executed even if you only intend to check the code:

          /tmp/use-demo>perl -cw demo.pl
          >>> About to execute 'use LoadMe;' <<<
          LoadMe: BEGIN is evaluated
          what could go wrong?
          LoadMe: UNITCHECK is evaluated
          Luckily, format C: does not work on linux
          LoadMe: Module initialisation
          Perl is fully working and can run arbitary code, including external programs
          LoadMe: import called as LoadMe->import()
          Oh well, no smart comment here
          >>> LoadMe was loaded <<<
          LoadMe: CHECK is evaluated
          Check check, is this thing on?
          demo.pl syntax OK
          /tmp/use-demo>perl -cw LoadMe.pm
          LoadMe: BEGIN is evaluated
          what could go wrong?
          LoadMe: UNITCHECK is evaluated
          Luckily, format C: does not work on linux
          LoadMe: CHECK is evaluated
          Check check, is this thing on?
          LoadMe.pm syntax OK
          /tmp/use-demo>

      None of this is new (except for UNITCHECK, introduced in perl 5.10), but everyone should be aware of these features. Pretending that loading a module does not execute any code in the module does not help.

      Alexander

      --
      Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)

      You can include other Perl files to your heart's content, and unless you explicitly export things from the included file (in Python, an import), you'll never be able to see them within your current namespace.

      Unless you fully qualify the symbol name. Package globals are truly global if fully qualified.

      ... use-ing a Perl file does not execute it, so any executable code you have in a Perl file will not be run when including it into another Perl file. ... main() code ... will not be executed or evaluated (into) when including said file in another file.

      Files that are use-ed (e.g., .pm files) are evaluated. That's where the necessary true return value comes from at the end of the module.

      File MainGuard.pm;

      File execution_of_used_module_1.pl:

      Output:
          c:\@Work\Perl\monks\R0b0t1>perl execution_of_used_module_1.pl Command Line Args
          in function MainGuard::main() with arguments
          MainGuard::main(Command Line Args)
          AAAAAAAArrrrrgh... at MainGuard.pm line 20.
          Compilation failed in require at execution_of_used_module_1.pl line 86.
          BEGIN failed--compilation aborted at execution_of_used_module_1.pl line 86.


      Give a man a fish:  <%-{-{-{-<

        You are right on both counts. But, I didn't consider fully qualified vars to be relevant here ;)

        ...and the true val, I didn't think of in this context. I should have.

        Thanks for pointing these out. ++

Re^3: Using guards for script execution?
by RonW (Parson) on Mar 03, 2017 at 00:13 UTC
    provide a namespace that was not package-level to prevent accidental use of "global" variables

    For clarification:

    Variables declared with my are not package variables. At file-level, they are file scoped, so will be accessible from the declaration until the end of file. When declared inside a block, they are only accessible until the end of that block.

    Package variables are declared with our or use vars and are accessible from inside the package they are declared in.

    Also, package variables can be accessed by their fully qualified names from anywhere. So, in that sense, they are also "global".
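
    A short sketch of these three points; the package name Counter and the variable names are hypothetical and chosen only for the illustration.

    ```perl
    use strict;
    use warnings;

    package Counter;    # hypothetical package, for illustration only

    my  $hidden  = 'lexical';    # file-scoped lexical: no symbol-table entry
    our $visible = 'package';    # package variable: lives at $Counter::visible

    package main;

    # The lexical is still in scope by its short name, because "my" follows
    # the enclosing file or block, not the package statement.
    my $via_short_name = $hidden;

    # The fully qualified name reaches only the package variable.
    my $via_full_name = $Counter::visible;

    # There is no $Counter::hidden. "no warnings" only silences the
    # "used only once" warning for this deliberate lookup.
    my $lexical_in_symbol_table = do {
        no warnings 'once';
        defined $Counter::hidden ? 'yes' : 'no';
    };

    print "short name: $via_short_name\n";
    print "full name:  $via_full_name\n";
    print "lexical in symbol table? $lexical_in_symbol_table\n";
    ```

    The last line reports "no": a my variable cannot be reached through a package name, whichever package statement was in effect when it was declared.
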

      Right, I'm referring to the unusual case where one accidentally makes use of a variable without redeclaring it and thus unintentionally modifies it in some enclosing scope.

      It doesn't happen often but is something I remember doing.
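
      A sketch of that mistake, with hypothetical names throughout (sum_careless, sum_careful and $total are made up for the illustration). Because a $total is already declared in the enclosing file scope, use strict does not object when the subroutine forgets its own my.

      ```perl
      use strict;
      use warnings;

      my $total = 100;    # file-scoped lexical, e.g. a running balance

      # Careless version: "my" was forgotten, so this silently reuses and
      # overwrites the file-scoped $total.
      sub sum_careless {
          $total = 0;
          $total += $_ for @_;
          return $total;
      }

      # Careful version: its own lexical, invisible outside the subroutine.
      sub sum_careful {
          my $total = 0;
          $total += $_ for @_;
          return $total;
      }

      my $careful_sum    = sum_careful(1, 2, 3);
      my $after_careful  = $total;    # outer value untouched

      my $careless_sum   = sum_careless(1, 2, 3);
      my $after_careless = $total;    # outer value was overwritten

      print "after sum_careful:  $after_careful\n";
      print "after sum_careless: $after_careless\n";
      ```

      Both subroutines return 6, but only the careless one changes the outer $total from 100 to 6. Wrapping the file-level code in a block or subroutine would have turned the missing my into a compile-time error under strict.
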