
I realize this is heresy, but saying “it is best to avoid BAREWORD filehandles and 2 argument open” is not as good advice as the boilerplate responses to that effect would have one believe. Advice without explanation is for children.

In particular, the examples given in the very perlintro(1) manpage that you reference indeed recommend that style right at the front:

Files and I/O

You can open a file for input or output using the open function. It’s documented in extravagant detail in perlfunc and perlopentut, but in short:
open(INFILE, "input.txt") or die "Can't open input.txt: $!";
open(OUTFILE, ">output.txt") or die "Can't open output.txt: $!";
open(LOGFILE, ">>my.log") or die "Can't open logfile: $!";
...et cetera...
Considering that those are the only sorts of I/O examples that you’ll find in the perlintro(1) manpage, I don’t understand the misconnect between recommending against it and recommending for it.

The only thing really “wrong” with those cited examples out of perlintro(1) in a modern text-processing environment is that they neglect the encoding, which can be remedied with a use open pragma — amongst several other ways, such as the newish PERLIO envariable or via post-facto binmoding as it has always been done.
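As a minimal sketch of the post-facto binmode remedy, assuming the text should be UTF-8: the scratch file from File::Temp and the sample string are illustrative only and do not come from perlintro.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical scratch file; any writable path would do.
my ($tmp, $path) = tempfile(UNLINK => 1);
close($tmp) or die "Can't close $path: $!";

my $text = "na\x{EF}ve caf\x{E9}\n";    # contains characters above ASCII

# Vintage-style open, then set the encoding layer after the fact.
open(OUTFILE, ">$path") or die "Can't open $path: $!";
binmode(OUTFILE, ":encoding(UTF-8)") or die "Can't binmode $path: $!";
print OUTFILE $text;
close(OUTFILE) or die "Can't close $path: $!";

# Read it back the same way and compare.
open(INFILE, $path) or die "Can't open $path: $!";
binmode(INFILE, ":encoding(UTF-8)") or die "Can't binmode $path: $!";
my $round_trip = <INFILE>;
close(INFILE) or die "Can't close $path: $!";

print $round_trip eq $text ? "round trip ok\n" : "round trip failed\n";
```

The open calls stay exactly as perlintro writes them; only the binmode lines are new.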

People have been programming Perl this way for more than two decades now. There is no need to go all PC-police on people for code that works perfectly well for their purposes. There are millions of lines of working Perl code out there that work in just this way.

Yes, there are times when more dedicated, non-shell-like constructs are more suitable.

But this is not one of them.

Re^3: unquoted string error??!!
by runrig (Abbot) on May 04, 2011 at 16:20 UTC

    Here is my (humble) opinion on why it is best to avoid bareword file handles...(Update: except for STDIN, STDERR, DATA, ARGV, and the like...)

    While it doesn't matter in a situation as simple as in the OP, in a non-trivial situation, if you use a lexical file handle (and 'use strict'), you can avoid the following bug:

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $file = "file.txt";
    CreateFile($file);
    system frobnicate => $file;

    sub CreateFile {
        my $f = shift;
        open(OUTPUT_FILE, ">$f") or die "Err: $!";
        for (1..100) {
            print OUTPUT_FILE "$_\n";
        }
        # blah, blah, blah...
        close OUTPUT_FH
    }
    This exact sort of thing happened here@work, where there is a serious lack of lexical file handle use, and took a non-zero amount of time to debug. It also failed to influence anyone's decision in choice of file handle type :-(
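A hedged sketch of the claim that a handle kept in a lexical variable turns this typo into a compile-time failure under strict. The variable names are made up for illustration, and the snippet is held in a string only so that the compile error can be captured and printed instead of aborting the script.

```perl
use strict;
use warnings;

# The same typo as in CreateFile, but with the handle in a lexical variable.
my $snippet = <<'END_SNIPPET';
use strict;
sub CreateFile {
    my $f = shift;
    open(my $output_file, '>', $f) or die "Err: $!";
    print {$output_file} "$_\n" for 1 .. 100;
    close($output_fh);    # typo: this variable was never declared
}
1;
END_SNIPPET

# eval of a string compiles it; a compile error leaves the message in $@.
my $compiled = eval $snippet;
my $error    = $@;

print defined $compiled ? "compiled\n" : "refused to compile: $error";
```

Nothing in the snippet ever runs, so no file is created; strict rejects the undeclared variable before execution starts.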
      Here is my (humble) opinion on why it is best to avoid bareword file handles...
      Yeah, let's avoid reading from STDIN, or avoid writing to STDOUT or STDERR. Or make use of the magical handles DATA or ARGV.

      As long as the three most important filehandles (important enough that by default (at least on Unix), all processes will have them) are bare file handles, I cannot stop thinking "uttered by someone with limited knowledge of Perl" when hearing advice like "is best to avoid bareword file handles".

      As for your example bug; first of all, your use of warnings would have caught any typos in filehandle names - close OUTPUT_FH generates the warning Name "main::OUTPUT_FH" used only once: possible typo at: .... Furthermore, the above does have a serious bug: it's not checking the return value of close. If it did, it would have noticed the failure of closing a handle that wasn't opened.

        Yeah, let's avoid reading from STDIN, or avoid writing to STDOUT or STDERR.

        Oh, c'mon. Oversight due to generalization, updated post.

        And to your other points: Yes, warnings are easy to notice in programs with very little output. And as noted elsewhere, there may or may not have been only a 'used once' warning in the actual code. And whether or not you catch the error in closing a handle that's already closed, it's better to catch the error at 'compile' time rather than runtime. Also note, "IT'S NOT MY CODE", and I consider myself lucky if others check the return value of open and include $! in the error message. If they start checking the return value of close, I might die of shock.

      If you willfully ignore all warnings and meaningful return values, then of course you should expect to have inherently buggy code:
      % perl -Mautodie /tmp/buggy
      Name "main::OUTPUT_FILE" used only once: possible typo at /tmp/buggy line 16.
      Can't close filehandle 'OUTPUT_FH': 'Bad file descriptor' at /tmp/buggy line 19
      Did you somehow expect something else? Perl gives you the tools to diagnose and debug such buggery. Do not blame Perl if you ignore its prudent advice.
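A small sketch of what the return value reports without autodie. NEVER_OPENED is a hypothetical handle name that nothing opens, and the two warning categories are switched off only so that the return value can be seen on its own.

```perl
use strict;
use warnings;
no warnings qw(once unopened);    # silence the diagnostics for this demo only

# Closing a handle that was never opened returns false and sets $!.
my $closed = close(NEVER_OPENED);
my $reason = "$!";

print $closed ? "close succeeded\n" : "close failed: $reason\n";
```

With warnings left on, Perl would also report the once-only name and the close on an unopened handle, which is the prudent advice referred to above.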

      PROOF: Lexical Filehandles Aren’t

      I have a problem with this whole “lexical filehandle” folderol. That problem is that the name does not fit the thing.

      • This is not a lexical filehandle:
        use vars qw($fh);
        undef $fh;
        open($fh, "> /tmp/data.$$") || die "can't open /tmp/data.$$: $!";
        print $fh "I am so NOT a lexical filehandle.\n";
        close($fh) || die "can't close /tmp/data.$$: $!";
      • This is a lexical filehandle:
        my $fh = *STDOUT;
        print $fh "I am SO TOO lexical filehandle.\n";

      I therefore submit that the thing you are talking about is not “lexical filehandles”, and to call them what they are not is to risk introduction of bugs in one’s mental model.

      I believe what you are referring to is not “lexical handles”, but instead autovivified anonymous handles, which may, or may not, happen to be stored in lexical variables.
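The distinction can be sketched as follows. The in-memory file and the variable names are illustrative assumptions; the point is only that the kind of handle and the kind of variable holding it are independent.

```perl
use strict;
use warnings;

my $text = "one line\n";

# Autovivified anonymous handle stored in a lexical variable.
open(my $lexical, '<', \$text) or die "can't open in-memory file: $!";

# The same kind of handle stored in a package variable instead.
our $global;
open($global, '<', \$text) or die "can't open in-memory file: $!";

# A lexical variable holding the named, package-global STDOUT glob.
my $alias = *STDOUT;

print "lexical holds a ", ref($lexical), " reference\n";
print "global  holds a ", ref($global),  " reference\n";
print "alias   holds the glob $alias\n";

close($lexical) or die "can't close: $!";
close($global)  or die "can't close: $!";
```

The first two handles are both references to anonymous globs, although only one lives in a lexical; the third variable is lexical but the handle behind it is the ordinary named STDOUT.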

        Ok, you're right, there is a warning here. I think though that the actual code closed another handle that was already closed from some other function. And if there was a warning, it would have been in a log file with lots of other output (though probably at the top of the file). And autodie might have caught it also, but we're using 5.8.8, so autodie is not in core, but since we tend to trap and email errors, that would probably require eval, and if you think I have trouble getting people to use lexical handles (or AAH's which happen to be stored in lexical variables...), just watch me try to get anyone to use eval.

        So sure, there are lots of ways to catch this bug, but I think the simplest way to catch it dead in its tracks at 'compile' time would have been to use a lexical handle.

        OR, say the subroutine had 2 handles, and you closed one of them twice. The other would be closed when the sub exited, so you'd still have 'bad' code, but no bug.

Re^3: unquoted string error??!!
by chromatic (Archbishop) on May 04, 2011 at 18:33 UTC
    Advice without explanation is for children.

    What's the best way, in your estimation, to explain to novice Perl 5 programmers that, given the vagaries and heuristics of S_intuit_method in toke.c, it's often easier to reason about the local and global effects of any individual unit of code if there are no barewords?

    (I can count on my fingers the number of people who may be able to explain all eight rules for S_intuit_method without having to consult the code as a refresher, and I might be overestimating the number of people so qualified. If anyone, you're one of them, but I don't count on having all novices as experienced or ready to understand in full as you.)

    This isn't about political correctness. It's about reducing the possibility of error in the same way that explaining that always, always, always adding a space between the file open mode and the filename in the two-argument form is as much a pattern for people to emulate as using the three-argument form. You know as well as anyone that novices tend to emulate examples without understanding them fully. Encouraging them to prefer constructs where, for example, the lack of an invisible character has no potential security flaws, seems to me to be more useful.
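To make the concern concrete, here is a sketch under stated assumptions: the "hostile" name is hypothetical, everything happens in a throwaway directory, and the example covers only the case of a name that begins with a mode character, not every whitespace subtlety of the two-argument form.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir      = tempdir(CLEANUP => 1);
my $precious = "$dir/precious.txt";

# Put some content in a file worth keeping.
open(my $setup, '>', $precious) or die "Can't create $precious: $!";
print {$setup} "keep me\n";
close($setup) or die "Can't close $precious: $!";

# Hypothetical user-supplied "filename" that begins with a mode character.
my $name = ">$precious";

# Three-argument form: the mode is fixed, so the name is taken literally
# and no such file is found.
my $three_arg_opened = open(my $safe, '<', $name) ? 1 : 0;
my $size_after_three = (stat $precious)[7];

# Two-argument form: the leading ">" is parsed as the mode, so the
# existing file is opened for writing and truncated.
my $two_arg_opened = open(my $unsafe, $name) ? 1 : 0;
close($unsafe) if $two_arg_opened;
my $size_after_two = (stat $precious)[7];

print "three-arg: opened=$three_arg_opened size=$size_after_three\n";
print "two-arg:   opened=$two_arg_opened size=$size_after_two\n";
```

Whether this amounts to a security flaw depends on where the name comes from, which is the question the replies go on to argue.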

      chromatic wrote:
      What's the best way, in your estimation, to explain to novice Perl programmers that, given the vagaries and heuristics of S_intuit_method in toke.c, it's often easier to reason about the local and global effects of any individual unit of code if there are no barewords?

      Sure, but there are even more important places where this issue arises. See below.

      (I can count on my fingers the number of people who may be able to explain all eight rules for S_intuit_method without having to consult the code as a refresher, and I might be overestimating the number of people so qualified. If anyone, you're one of them, but I don't count on having all novices as experienced or ready to understand in full as you.)

      Here, take a shot at it yourself. ☻ Your goal is to uncomment just one from each set of sorted alternatives to produce this exact output:

      fileno is 2
      fileno is 2
      That's all, folks!
      Here’s the code:
      sub new { die "$0: main::new() sub called\n"; }
      sub Class { die "$0: main::Class() sub called\n"; }

      ### PICK EXACTLY ONE OF:
      #1# local *new ;
      #2# local *new = Class new ;
      #3# local *new = Class->new() ;
      #4# local *new = Class::->new() ;
      #5# local *new = new Class ;
      #6# local *new = new Class:: ;

      ### PICK EXACTLY ONE OF:
      #1# open( new, "> &=STDERR") || die;
      #2# open( new, "> &STDERR") || die;
      #3# open( new, ">& STDERR") || die;
      #4# open( new, ">&", *STDERR) || die;
      #5# open( new, ">&=STDERR") || die;
      #6# open(*new, ">&=STDERR") || die;
      #7# open(*new, ">&", *STDERR) || die;

      my $fd = fileno(*new) // die "no fileno";
      die "wrong fileno" unless $fd == 2;
      my $output = "fileno is $fd\n";
      syswrite(*main::STDOUT, $output, length($output));

      ### PICK EXACTLY ONE OF:
      #1# (print new $output ) || die;
      #2# (print "new" $output ) || die;
      #3# (print *new $output ) || die;
      #4# (print ::new $output ) || die;
      #5# (print {*new} $output ) || die;

      ### PICK EXACTLY ONE OF:
      #1# close( new ) ? done() : die "can't close new: $!";
      #2# close( new::) ? done() : die "can't close new: $!";
      #3# close( "new" ) ? done() : die "can't close new: $!";
      #4# close( *new ) ? done() : die "can't close new: $!";
      #5# close( ::new ) ? done() : die "can't close new: $!";
      #6# close(*::new ) ? done() : die "can't close new: $!";

      sub done { print "That's all, folks!\n"; }

      package Class;
      sub new { require "IO/Handle.pm"; return IO::Handle::->new(); }
      Good luck. :)

      This isn't about political correctness. It's about reducing the possibility of error in the same way that explaining that always, always, always adding a space between the file open mode and the filename in the two-argument form is as much a pattern for people to emulate as using the three-argument form.
      Yes, ok: that’s sound advice. One problem is people are unclear on what a mode even is. See all the flailing above.
      You know as well as anyone that novices tend to emulate examples without understanding them fully.
      There are something like 650 uses of vintage filehandles in the standard documentation set, and perhaps 400 such uses in the Camel. What do you propose to do, change all those to meet the new purity laws? We already had handle autovivification in 5.6.1, and didn’t see fit to do so then. Has something changed since then? Or was that a terrible blunder? And what should be done in future? You perceive this weighs on my mind.

      And what do you do when a user comes to you unhappy that the old standard copy pattern:

      print OUTPUT while <INPUT>;
      Has no corresponding clean translation? How do you explain that one to them? There is no nice story here: lexical filehandles can too easily break several standard practices that people have come to rely on more than they realize. It is never a pleasant task to explain these strange failures that can result.
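For reference, a sketch of the closest equivalent with handles stored in scalars; whether it counts as clean is the point under debate. In-memory files stand in for INPUT and OUTPUT, and $source and $copy are illustrative names.

```perl
use strict;
use warnings;

my $source = "alpha\nbeta\n";
my $copy   = '';

open(my $in,  '<', \$source) or die "Can't read in-memory source: $!";
open(my $out, '>', \$copy)   or die "Can't write in-memory copy: $!";

# With the handle in a scalar, the implicit $_ can no longer be left out:
# in "print $out while <$in>;" Perl takes $out as the thing to print.
print {$out} $_ while <$in>;

close($out) or die "Can't close copy: $!";
close($in)  or die "Can't close source: $!";

print $copy eq $source ? "copied intact\n" : "copy differs\n";
```

The braces around the handle are optional here, but naming $_ explicitly is not.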

      And even if the thousand or so uses of vintage filehandles were expunged from the online docs and the Camel (which I personally find to be a terrifically intimidating amount of work — which I do not care to sign myself up to!), what then do you do about the millions of uses of them in existing code that are already out there? Ban them? The current doc policy seems to be to remove all mention of things “we don’t like”; how does that serve the public good?

      Encouraging them to prefer constructs where, for example, the lack of an invisible character has no potential security flaws, seems to me to be more useful.
      Security flaws? Don’t you think that’s unnecessarily overstating things? That’s like saying that the old rename script has security flaws:
      $op = shift() || die;
      for (@ARGV) {
          $was = $_;
          eval $op;
          die if $@;
          rename($was, $_) || die "rename: $!" unless $was eq $_;
      }
      The point here is that if you are allowing untrusted antagonists to specify the arguments to your syscalls, then you have bigger problems than mode bits.

      That said, I was aghast to find this embarrassing silliness still in perlfunc:

      If you want to select on many filehandles, you may wish to write a subroutine like this:
      sub fhbits {
          my(@fhlist) = split(' ',$_[0]);
          my($bits);
          for (@fhlist) {
              vec($bits,fileno($_),1) = 1;
          }
          $bits;
      }
      $rin = fhbits('STDIN TTY SOCK');
      I of course fixed it to read what it should have read since oh, probably perl4 or so:
      sub fhbits {
          my @fhlist = @_;    # take the globs as a list; splitting $_[0] would see only the first
          my $bits;
          for (@fhlist) {
              vec($bits, fileno($_), 1) = 1;
          }
          return $bits;
      }
      $rin = fhbits(*STDIN, *TTY, *SOCK);
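A hedged usage sketch of such a helper with four-argument select, assuming a Unix-like system where select works on pipes. The pipe is only a stand-in for the TTY and SOCK handles of the original, and this fhbits takes its handles as a plain list.

```perl
use strict;
use warnings;

# Build a select() bit vector from a list of handles.
sub fhbits {
    my @fhlist = @_;
    my $bits   = '';
    for my $fh (@fhlist) {
        vec($bits, fileno($fh), 1) = 1;
    }
    return $bits;
}

# Hypothetical fixture: a pipe with one byte waiting to be read.
pipe(my $reader, my $writer) or die "pipe failed: $!";
defined(syswrite($writer, "x")) or die "syswrite failed: $!";

# select() overwrites its arguments with the set of ready handles.
my $rout   = fhbits($reader);
my $nfound = select($rout, undef, undef, 0);    # zero timeout: just poll

print "handles ready: $nfound\n";
print "reader ready: ", vec($rout, fileno($reader), 1), "\n";
```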
      Filehandles seem to me to be the least of several worrisome bareword issues. In front of that concern come not just bareword strings but most especially an agonizing confusion as to what is and what is not a subroutine call, a method invocation, or even a class name. Doesn’t that bother you?

      Used reasonably, vintage filehandles work perfectly well. I’m not sure the same can be said of those others. One needn’t resort to such games as this one:

      no strict;
      no warnings;
      no less tricksy;

      foo(lish);
      his::bar(tab);
      silly->stuff;
      come on, please give up;

      package UNIVERSAL;
      sub AUTOLOAD {
          print "I am masquerading as $AUTOLOAD(@_)\n"
      }
      to realize there’s a massively bigger bareword problem lurking right in front of us than that of filehandles alone. All the various multiple‐choice alternatives in my long code segment above should have by now made that starkly clear.

        Now, now boys. Cocks away. Urea has a horrible effect upon lime mortar.

        99% of Perl users will go through their lives without ever needing to know these obscure details. And on the rare occasions they are bitten by them, they'll juggle their code a bit and fix their problem without ever understanding the deep reasons behind what they did.

        There is a temptation amongst those with deep inside knowledge to want to warn of all the traps and dangers--like overprotective mothers with bikes & skateboards & horses. But conveying *all* the dangers and possible consequences in detail is just too time consuming, so they seek to codify them into simplified rule sets that must be blanket imposed on all.

        They forget that newbies, like kids, need to learn from their own mistakes. It's not just a rite of passage, but the best way to learn. They learn not just the solution to their mistakes, but how to approach solving problems. And how to form their own judgements about which precautions are always valuable, and which are just motherly paranoia.

        Take that learning process away from them and you end up with rote-learnt semi-experienced programmers that are completely out of their depth the moment something slightly beyond your simplified rules comes along. Dead in the water with no problem solving skills to fall back on.

        And when interviewing, they have no answers to the "why did you do that?" questions.

        What do you propose to do, change all those to meet the new purity laws?

        Certainly not, even if this were about purity or had the force of fiat.

        However, for those documents which I do create or edit, I prefer to explain those practices I perceive to be "better" (in any or every sense of less code, fewer side effects, more secure, easier to read, better encapsulated, or subjectively more pleasing aesthetically) and, having explained my reasoning, explain the alternative (and, let's be fair, often more historical) approaches.

        The current doc policy seems to be to remove all mention of things “we don’t like”; how does that serve the public good?

        You may be overstating things; it's certainly easy to find voluminous examples of multiple approaches pre- and post-5.6.0 throughout the documentation. If there were a diktat to scrub from history even the lingering scent of package global typeglob filehandles, the best one could say about it is that at least it moves slowly.

        Security flaws? Don’t you think that’s unnecessarily overstating things?

        Only in the sense that local privilege escalation errors are less bothersome than remote privilege escalation errors. Certainly I hope that the last line of defence never comes down to the presence or absence of a space between file mode and filename in the two-argument form of open, but I use the three-argument form on my own pervasively so I never have to worry about it not being a line of defence.

        Filehandles seem to me to be the least of several worrisome bareword issues.

        Predeclared (but not imported) subs are less a concern to me than bareword filehandles, but even though I know (most of) the rules of bareword disambiguation, I don't trust myself to remember all of the possible ways code I write could fall afoul of the mismatch between my heuristics and those of toke.c. Certainly careful forethought helps as do good habits and plenty of practice, but given a reasonable and well-distributed alternative that has other advantages, my preference is clear.

        I'd even write class names such as My::Class::, if one percent of CPAN authors also did so. Alas, that disambiguation is so relatively unknown.
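A sketch of that disambiguation, assuming a sub in the current package that shares its name with a class. The Class and Decoy packages are made up for the example.

```perl
use strict;
use warnings;

package Class;
sub new { return bless {}, shift }

package Decoy;
sub new { return bless {}, shift }

package main;

# A predeclared sub that happens to share its name with the class.
sub Class { return 'Decoy' }

my $ambiguous = Class->new;      # parsed as Class()->new
my $explicit  = Class::->new;    # trailing :: forces the package name

print "Class->new   gave a ", ref($ambiguous), "\n";
print "Class::->new gave a ", ref($explicit),  "\n";
```

Quoting the name, as in 'Class'->new, has the same effect as the trailing double colon.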

        And what do you do when a user comes to you unhappy that the old standard copy pattern:
        print OUTPUT while <INPUT>;
        Has no corresponding clean translation? How do you explain that one to them? There is no nice story here: lexical filehandles can too easily break several standard practices that people have come to rely on more than they realize. It is never a pleasant task to explain these strange failures that can result.

        Can you explain what you mean?