
Re^2: Summing numbers in a file

by jcb (Parson)
on May 31, 2020 at 01:01 UTC ( [id://11117517]=note )


in reply to Re: Summing numbers in a file
in thread Summing numbers in a file

the preferred style is to use a scalar variable for a file handle: open(my $N, '<', $filename);

I will agree with this in the context of a sub that opens a file and closes it before returning, or that returns a file handle. Indeed, I would argue that using a global handle in either of those cases is incorrect. However, I will quibble with this at top-level as in this case: there is no functional difference between the lexical file handles in your example and the traditional global file handles — in both cases, a handle opened at top-level is defined until the end of the script and valid until closed.
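A minimal sketch of the two styles side by side, plus a sub that returns a handle. The temporary file, the handle name NUMBERS, and the sub name open_numbers are assumptions made for illustration; they are not taken from the original script in this thread.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical input: a temporary file holding one number per line.
my ($tmp, $filename) = tempfile(UNLINK => 1);
print {$tmp} "$_\n" for 1 .. 4;
close $tmp or die "Cannot close $filename: $!";

# A sub that returns a handle: a lexical fits naturally here,
# because every call produces its own independent handle.
sub open_numbers {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    return $fh;
}

# Top level, traditional global (bareword) handle.
open(NUMBERS, '<', $filename) or die "Cannot open $filename: $!";
my $sum_bareword = 0;
while (my $line = <NUMBERS>) {
    chomp $line;
    $sum_bareword += $line;
}
close NUMBERS;

# Top level, lexical handle obtained from the sub.
my $in = open_numbers($filename);
my $sum_lexical = 0;
while (my $line = <$in>) {
    chomp $line;
    $sum_lexical += $line;
}
close $in;

print "bareword: $sum_bareword, lexical: $sum_lexical\n";
```

For a single-file script like this one, both loops read the same data and arrive at the same sum.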

Please correct me if I am somehow misinformed about this.

PS: Our questioner forgot error handling, so I will point out that open ... or die is an important Perl idiom that is hidden in your example behind the autodie pragma.
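As a sketch of that idiom next to the pragma, assuming a path that does not exist on the machine running it (the path itself is made up):

```perl
use strict;
use warnings;

# Hypothetical path that is assumed not to exist.
my $missing = '/nonexistent/directory/numbers.txt';

# Explicit idiom: check the return value of open and die with $!.
my $explicit_ok = eval {
    open(my $fh, '<', $missing) or die "Cannot open $missing: $!\n";
    1;
};
my $explicit_error = $@;

# autodie (lexically scoped to this block) makes a failed open throw by itself.
my $autodie_ok = eval {
    use autodie;
    open(my $fh, '<', $missing);
    1;
};
my $autodie_error = $@;    # an autodie::exception object

print "explicit: $explicit_error";
print "autodie:  $autodie_error\n";
```

Both blocks fail in the same place; the difference is only whether the check is written out at each call or supplied by the pragma.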

Re^3: Summing numbers in a file
by Athanasius (Archbishop) on May 31, 2020 at 07:07 UTC

    Hello jcb,

    However, I will quibble with this at top-level as in this case: there is no functional difference between the lexical file handles in your example and the traditional global file handles — in both cases, a handle opened at top-level is defined until the end of the script and valid until closed.

    Well, for this particular script, that is quite true. And actually, my advice was only meant as indicating good practice in general. However, it is easy to show that a lexical file handle may be a better choice even at the top level:

    Yes, this is a highly contrived example. But then, why take even remote risks when they can be easily eliminated by the consistent application of good practice?
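A minimal sketch of this kind of pitfall; it is not the original listing. The names frobnicate, Foo, and FH come from the discussion in this thread, while the file contents, the helper make_file, and the lexical variant frobnicate_lexical are assumptions made for illustration. Everything is kept in one file here, so the sub stands in for code that would normally live in Foo.pm.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Helper (hypothetical): write the given lines to a temporary file.
sub make_file {
    my (@lines) = @_;
    my ($tmp, $name) = tempfile(UNLINK => 1);
    print {$tmp} "$_\n" for @lines;
    close $tmp or die "Cannot close $name: $!";
    return $name;
}

my $data_file  = make_file(1, 2, 3);
my $other_file = make_file(10, 20);

package Foo;

# Stand-in for a sub in Foo.pm that reuses the bareword FH, i.e. Foo::FH.
sub frobnicate {
    my ($path) = @_;
    open(FH, '<', $path) or die "Cannot open $path: $!";
    chomp(my $first = <FH>);
    return $first;
}

# The same sub written with a lexical handle.
sub frobnicate_lexical {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    chomp(my $first = <$fh>);
    return $first;
}

# Script code that also lives in package Foo and also uses FH.
open(FH, '<', $data_file) or die "Cannot open $data_file: $!";
chomp(my $total_bareword = <FH>);    # reads 1 from the data file
frobnicate($other_file);             # silently re-opens Foo::FH
while (my $line = <FH>) {            # now continues in the other file
    chomp $line;
    $total_bareword += $line;
}
close FH;

# With lexical handles, the two handles cannot interfere.
open(my $in, '<', $data_file) or die "Cannot open $data_file: $!";
chomp(my $total_lexical = <$in>);
frobnicate_lexical($other_file);
while (my $line = <$in>) {
    chomp $line;
    $total_lexical += $line;
}
close $in;

print "bareword: $total_bareword, lexical: $total_lexical\n";
```

The bareword version adds 1 from the data file and 20 from the other file without any error or warning; the lexical version sums the data file as intended.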

    Hope that’s of interest,

    Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

      Do I misunderstand, or is the colliding file handle actually Foo::FH in the example? In this example, I would say that the problem is not the use of a global file handle, but the main script placing its code into package Foo and calling frobnicate incorrectly. The use of subroutine prototypes would either make the bug in frobnicate obvious or raise a compile-time error at line 18 when it is called with too many arguments.

      I argue that lexical file handles are a neutral matter of style when they are declared at top-level (which is normally limited to the main script because modules typically provide subs but do not execute code upon loading). The real problem in the contrived example is calling a subroutine with the wrong number of arguments.

      In a case where the file handle is intended to be an "environment parameter" to a subroutine, global file handles are the only option, but please do not actually do that in production code, or at least very clearly document routines that expect certain global file handles to be set up by their callers.

        Do I misunderstand, or is the colliding file handle actually Foo::FH in the example?

        Yes, that’s correct. The example was intended to show a possible pitfall of using a package global variable instead of a lexical variable.

        In this example, I would say that the problem is not the use of a global file handle, but the main script placing its code into package Foo and calling frobnicate incorrectly. The use of subroutine prototypes would either make the bug in frobnicate obvious or raise a compile-time error at line 18 when it is called with too many arguments.

        Well, the idea was that the call to frobnicate was actually what was intended, but the implementation of the subroutine in Foo.pm was erroneous. So if prototypes were used, the sub might well have the correct prototype — sub frobnicate ($) — and then still fail without error or warning. (But anyone using prototypes is strongly advised to familiarize themselves with Far More than Everything You've Ever Wanted to Know about Prototypes in Perl -- by Tom Christiansen first.)
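A small sketch of what a ($) prototype does check, assuming a made-up body for frobnicate. String eval is used only so that the compile-time error can be captured and printed instead of aborting the script.

```perl
use strict;
use warnings;

# Hypothetical sub with a one-scalar prototype.
sub frobnicate ($) {
    my ($value) = @_;
    return $value * 2;
}

my $doubled = frobnicate(21);    # matches the prototype

# A call with too many arguments is rejected when it is compiled.
# String eval defers that compilation to run time so the error can be captured.
my $compiled = eval q{ frobnicate(1, 2); 1 };
my $error    = $@;

print "doubled: $doubled\n";
print "compile error: $error" unless $compiled;
```

The prototype constrains only the call site. It says nothing about what the body of the sub does with its file handles, so a correctly prototyped sub can still clobber a package global.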

        But all this is a bit beside the point; I did say the example was “highly contrived.” :-) The point I was trying to make is that variables should, as a rule of good practice, be given the minimum scope needed, and no more [1]. Think defensive programming. Another important principle is this: wherever possible, prefer compile-time errors to run-time errors [2]. If you can make the compiler do your debugging for you, you’ll save both time and effort. So in the case under discussion, by limiting the scope of the filehandle variable you can harness the compiler to check for errors which you would otherwise have to hunt down yourself.

        In the end, as always in Perl, TMTOWTDI. But, what do you gain by using a package global variable for a filehandle where a lexical variable would do the same job?

        [1] I consider this a corollary of the Principle of least privilege, although strictly speaking that principle has a somewhat narrower focus.
        [2] This principle is set forth by Scott Meyers in one of his Effective C++ books; I don’t have the reference to hand.

        [As I was about to post this, I found that haukex had already made a similar case in his excellent reply Re^5: Summing numbers in a file. I’ll post mine anyway, but please read it in conjunction with the post by haukex.]

        Cheers,

        Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

Re^3: Summing numbers in a file
by haukex (Archbishop) on May 31, 2020 at 09:18 UTC
    there is no functional difference between the lexical file handles in your example and the traditional global file handles

    Please see this recent discussion about lexical vs. bareword filehandles. In particular, open(N, "<", $filename); chomp (my $numbers = <M>); only gives warnings, while open(my $N, "<", $filename); chomp (my $numbers = <$M>); is of course a fatal error under strict. Also, bareword filehandles clash with package names.
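A sketch of that difference, assuming a temporary file as input. The lexical case is wrapped in a string eval only so the script can capture the compile-time error and keep running; the variable names other than N and M are made up.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical input file.
my ($tmp, $filename) = tempfile(UNLINK => 1);
print {$tmp} "1 2 3\n";
close $tmp or die "Cannot close $filename: $!";

# Bareword handle with a typo (M for N): compiles and runs, warnings only.
# A separate "used only once" warning for M is emitted at compile time,
# before this handler is installed, and goes to STDERR.
my @runtime_warnings;
my $numbers;
{
    local $SIG{__WARN__} = sub { push @runtime_warnings, @_ };
    open(N, '<', $filename) or die "Cannot open $filename: $!";
    $numbers = <M>;          # undef; warns about an unopened filehandle
    close N;
}

# Lexical handle with the same typo: refused at compile time under strict,
# so none of the code inside the eval is ever run.
my $compiled = eval q{
    open(my $N, '<', $filename) or die "Cannot open $filename: $!";
    my $line = <$M>;
    1;
};
my $strict_error = $@;

print "bareword warning: $_" for @runtime_warnings;
print "lexical error: $strict_error" unless $compiled;
```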

      That recent discussion involved "pseudo-lexical" file handles using local. I agree that that is a bad idea, but maintain that bareword file handles are reasonable in top-level code. (Modules do not normally contain top-level code.)

      While the typo-catching features of use strict are helpful, you should not be using file handle names that are that easily confused in the first place. In particular, FH is suitable for examples of I/O code, but should not be used in real programs as a global. Global file handles should have meaningful names. For example, I recently wrote code that imports a text-format package manifest into a database; the file is read using a handle named MANIFEST.

      bareword filehandles clash with package names

      I presume that is the origin of the convention of always writing global file handles in all UPPERCASE, since package names are (with few exceptions, like UNIVERSAL) always mixed-case (or lowercase for pragmas) by convention?

        First of all, note that all of this is in the context of what advice to give wisdom seekers. You're of course free to code however you like.

        That recent discussion involved "pseudo-lexical" file handles using local.

        I know, but many of the issues discussed still apply. And again, I'll point out that lexical filehandles solve all of the issues discussed here. I'll also ask the same thing as I did in that thread: I've named some disadvantages, what are the advantages that you see to using bareword filehandles?

        bareword file handles are reasonable in top-level code. (Modules do not normally contain top-level code.)

        The issue is not where the code is, i.e. whether it's "top-level" or not, it's action at a distance: a module may load another module that may load another module that may do something that clashes with a global the main code is using; those issues are not fun to debug.

        While the typo-catching features of use strict are helpful, you should not be using file handle names that are that easily confused in the first place.

        Sorry, but how is this argument different from "you don't need strict as long as you don't make typos"?

        Let me pull together several quotes from your replies in this subthread and add some emphasis to try to point out a theme:

        In this example, I would say that the problem is not the use of a global file handle, but the main script placing its code into package Foo and calling frobnicate incorrectly.

        The use of subroutine prototypes would either make the bug in frobnicate obvious...

        The real problem in the contrived example is calling a subroutine with the wrong number of arguments.

        which is normally limited to the main script because modules typically provide subs but do not execute code upon loading ... Modules do not normally contain top-level code.

        please do not actually do that in production code, or at least very clearly document

        ... you should not be using file handle names that are that easily confused in the first place.

        In particular, FH is suitable for examples of I/O code, but should not be used in real programs as a global.

        Global file handles should have meaningful names.

        I presume that is the origin of the convention of always writing global file handles in all UPPERCASE, since package names are (with few exceptions, like UNIVERSAL) always mixed-case (or lowercase for pragmas) by convention?

        Of course the normal convention is that everyone should write correct, bug-free code! ;-P Update: Just to be clear, the theme I see here is that you seem to be placing a lot of expectations on people to write correct code, when simply using lexical filehandles easily provides protection from the issues. /Update

        (By the way, Prototypes are often discouraged now except when used to change how subroutine calls are parsed.)

        Speaking of your other post:

        In a case where the file handle is intended to be an "environment parameter" to a subroutine, global file handles are the only option

        Sorry, but I don't get this - what do you mean by an "environment parameter"? And I very strongly disagree with "only option".
