chromatic wrote:
What's the best way, in your estimation, to explain to novice Perl programmers that, given the vagaries and heuristics of S_intuit_method in toke.c, it's often easier to reason about the local and global effects of any individual unit of code if there are no barewords?

Sure, but there are even more important places where this issue arises. See below.

(I can count on my fingers the number of people who may be able to explain all eight rules for S_intuit_method without having to consult the code as a refresher, and I might be overestimating the number of people so qualified. If anyone, you're one of them, but I don't count on having all novices as experienced or ready to understand in full as you.)

Here, take a shot at it yourself. ☻ Your goal is to uncomment just one from each set of sorted alternatives to produce this exact output:

fileno is 2
fileno is 2
That's all, folks!
Here’s the code:
sub new   { die "$0: main::new() sub called\n";   }
sub Class { die "$0: main::Class() sub called\n"; }

### PICK EXACTLY ONE OF:
#1# local *new                   ;
#2# local *new = Class new       ;
#3# local *new = Class->new()    ;
#4# local *new = Class::->new()  ;
#5# local *new = new Class       ;
#6# local *new = new Class::     ;

### PICK EXACTLY ONE OF:
#1# open( new, "> &=STDERR")     || die;
#2# open( new, "> &STDERR")      || die;
#3# open( new, ">& STDERR")      || die;
#4# open( new, ">&", *STDERR)    || die;
#5# open( new, ">&=STDERR")      || die;
#6# open(*new, ">&=STDERR")      || die;
#7# open(*new, ">&", *STDERR)    || die;

my $fd = fileno(*new) // die "no fileno";
die "wrong fileno" unless $fd == 2;
my $output = "fileno is $fd\n";
syswrite(*main::STDOUT, $output, length($output));

### PICK EXACTLY ONE OF:
#1# (print  new   $output ) || die;
#2# (print "new"  $output ) || die;
#3# (print *new   $output ) || die;
#4# (print ::new  $output ) || die;
#5# (print {*new} $output ) || die;

### PICK EXACTLY ONE OF:
#1# close(  new   ) ? done() : die "can't close new: $!";
#2# close(  new:: ) ? done() : die "can't close new: $!";
#3# close( "new"  ) ? done() : die "can't close new: $!";
#4# close( *new   ) ? done() : die "can't close new: $!";
#5# close( ::new  ) ? done() : die "can't close new: $!";
#6# close(*::new  ) ? done() : die "can't close new: $!";

sub done { print "That's all, folks!\n"; }

package Class;
sub new {
    require "IO/Handle.pm";
    return IO::Handle::->new();
}
Good luck. :)

This isn't about political correctness. It's about reducing the possibility of error in the same way that explaining that always, always, always adding a space between the file open mode and the filename in the two-argument form is as much a pattern for people to emulate as using the three-argument form.
Yes, ok: that’s sound advice. One problem is people are unclear on what a mode even is. See all the flailing above.
You know as well as anyone that novices tend to emulate examples without understanding them fully.
There are something like 650 uses of vintage filehandles in the standard documentation set, and perhaps 400 such uses in the Camel. What do you propose to do, change all those to meet the new purity laws? We already had handle autovivification in 5.6.1, and didn’t see fit to do so then. Has something changed since then? Or was that a terrible blunder? And what should be done in future? You perceive this weighs on my mind.

And what do you do when a user comes to you unhappy that the old standard copy pattern:

print OUTPUT while <INPUT>;
Has no corresponding clean translation? How do you explain that one to them? There is no nice story here: lexical filehandles can too easily break several standard practices that people have come to rely on more than they realize. It is never a pleasant task to explain these strange failures that can result.
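One way to read the complaint, sketched here under the assumption that the handles are lexical scalars: the bareword idiom leans on print taking $_ by default, but with a lexical handle a bare "print $out" is parsed as printing the value of $out to the currently selected handle, so the default variable has to be spelled out. The names $in, $out, $text, and $copy are made up for this illustration, and in-memory handles stand in for real files.

```perl
use strict;
use warnings;

# Hypothetical input; in-memory handles keep the sketch self-contained.
my $text = "alpha\nbeta\n";
my $copy = "";

open(my $in,  "<", \$text) or die "can't read from string: $!";
open(my $out, ">", \$copy) or die "can't write to string: $!";

# The vintage form was:  print OUTPUT while <INPUT>;
# With lexical handles, $_ must be named explicitly. The block form
# {$out} makes it unambiguous that $out is the handle, not the data.
print {$out} $_ while <$in>;

close($out) or die "can't close output: $!";
close($in)  or die "can't close input: $!";

print $copy;
```

Whether the extra "$_" counts as a clean translation is a matter of taste; the sketch only shows that the one-for-one substitution of $out for OUTPUT does not do the same thing.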

And even if the thousand or so uses of vintage filehandles were expunged from the online docs and the Camel (which I personally find to be a terrifically intimidating amount of work — which I do not care to sign myself up to!), what then do you do about the millions of uses of them in existing code that are already out there? Ban them? The current doc policy seems to be to remove all mention of things “we don’t like”; how does that serve the public good?

Encouraging them to prefer constructs where, for example, the lack of an invisible character has no potential security flaws, seems to me to be more useful.
Security flaws? Don’t you think that’s unnecessarily overstating things? That’s like saying that the old rename script has security flaws:
$op = shift() || die;
for (@ARGV) {
    $was = $_;
    eval $op;
    die if $@;
    rename($was, $_) || die "rename: $!" unless $was eq $_;
}
The point here is that if you are allowing untrusted antagonists to specify the arguments to your syscalls, then you have bigger problems than mode bits.
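For anyone unsure what keeping the mode apart from the filename buys, here is a minimal sketch, not a claim about how likely such input is in practice. It assumes a hypothetical hostile name ending in a pipe character and assumes no file of that literal name exists in the current directory. In the two-argument form a trailing "|" asks open to run a command; in the three-argument form the name is only ever a filename.

```perl
use strict;
use warnings;

# Hypothetical untrusted "filename". Passed to two-argument open, the
# trailing pipe would make Perl run "echo oops" as a command.
my $name = "echo oops |";

# Three-argument form: the mode is its own argument, so $name is
# treated purely as a literal filename.
my $opened = open(my $fh, "<", $name);
my $error  = $opened ? "" : "$!";

if ($opened) {
    print "opened a file literally named '$name'\n";
    close($fh) or die "can't close: $!";
}
else {
    print "no command was run; open failed with: $error\n";
}
```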

That said, I was aghast to find this embarrassing silliness still in perlfunc:

If you want to select on many filehandles, you may wish to write a subroutine like this:
sub fhbits {
    my(@fhlist) = split(' ',$_[0]);
    my($bits);
    for (@fhlist) {
        vec($bits,fileno($_),1) = 1;
    }
    $bits;
}
$rin = fhbits('STDIN TTY SOCK');
I of course fixed it to read what it should have read since oh, probably perl4 or so:
sub fhbits {
    my @fhlist = @_;    # take the handles as a list, not one space-separated string
    my $bits;
    for (@fhlist) {
        vec($bits, fileno($_), 1) = 1;
    }
    return $bits;
}
$rin = fhbits(*STDIN, *TTY, *SOCK);
Filehandles seem to me to be the least of several worrisome bareword issues. In front of that concern come not just bareword strings but most especially an agonizing confusion as to what is and what is not a subroutine call, a method invocation, or even a class name. Doesn’t that bother you?

Used reasonably, vintage filehandles work perfectly well. I’m not sure the same can be said of those others. One needn’t resort to such games as this one:

no strict;
no warnings;
no less tricksy;

foo(lish);
his::bar(tab);
silly->stuff;
come on, please give up;

package UNIVERSAL;
sub AUTOLOAD { print "I am masquerading as $AUTOLOAD(@_)\n" }
to realize there’s a massively bigger bareword problem lurking right in front of us than that of filehandles alone. All the various multiple‐choice alternatives in my long code segment above should have by now made that starkly clear.
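A minimal sketch of one such ambiguity, following the behaviour described in perlobj: when a subroutine with the same name as a class is visible in the current package, Class->new is parsed as Class()->new, while the package-quoted Class::->new is always read as a class name. The packages Class and Hijacked and the sub main::Class are hypothetical names invented for this illustration.

```perl
use strict;
use warnings;

{ package Class;    sub new { return bless {}, shift } }
{ package Hijacked; sub new { return bless {}, shift } }

# A sub in the current package that happens to share the class's name.
sub Class { return "Hijacked" }

# Bareword form: parsed as Class()->new, so the method is invoked on
# whatever the sub returns, here the string "Hijacked".
my $plain = Class->new;

# Package-quoted form: the trailing :: forces the class-name reading.
my $quoted = Class::->new;

print "plain:  ", ref($plain),  "\n";
print "quoted: ", ref($quoted), "\n";
```

The quoted string form 'Class'->new disambiguates the same way; the trailing :: just keeps it looking like a bareword.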

Re^5: unquoted string error??!!
by Anonymous Monk on May 05, 2011 at 05:17 UTC

    Now, now, boys. Cocks away. Urea has a horrible effect upon lime mortar.

    99% of Perl users will go through their lives without ever needing to know these obscure details. And on the rare occasions they are bitten by them, they'll juggle their code a bit and fix their problem without ever understanding the deep reasons behind what they did.

    There is a temptation amongst those with deep inside knowledge to want to warn of all the traps and dangers--like over protective mothers with bikes & skateboards & horses. But conveying *all* the dangers and possible consequences in detail is just too time consuming, so they seek to codify them into simplified rule sets that must be blanket imposed on all.

    They forget that newbies, like kids, need to learn from their own mistakes. It's not just a rite of passage, but the best way to learn. They learn not just the solution to their mistakes, but how to approach solving problems. And how to form their own judgements about which precautions are always valuable, and which are just motherly paranoia.

    Take that learning process away from them and you end up with rote-learnt semi-experienced programmers that are completely out of their depth the moment something slightly beyond your simplified rules comes along. Dead in the water with no problem solving skills to fall back on.

    And when interviewing, they have no answers to the "why did you do that?" questions.

      ... newbies, like kids, need to learn from their own mistakes. It's not just a rite of passage, but the best way to learn.

      I thought Matt's Script Archives put that little fib to rest back in the year 19100.

        Mrs Brown's sister's niece knew a woman whose son was playing on a skateboard in his own backyard. He was wearing a helmet, neck brace, lumbar support, gloves, kneepads and groinbox, and he still died.

        But mum, he was crushed by a dead cow falling out of the vortex of a tornado.

        I don't care. Skateboards are too dangerous.

Re^5: unquoted string error??!!
by chromatic (Archbishop) on May 05, 2011 at 00:24 UTC
    What do you propose to do, change all those to meet the new purity laws?

    Certainly not, even if this were about purity or had the force of fiat.

    However, for those documents which I do create or edit, I prefer to explain those practices I perceive to be "better" (in any or every sense of less code, fewer side effects, more secure, easier to read, better encapsulated, or subjectively more pleasing aesthetically) and, having explained my reasoning, explain the alternative (and, let's be fair, often more historical) approaches.

    The current doc policy seems to be to remove all mention of things “we don’t like”; how does that serve the public good?

    You may be overstating things; it's certainly easy to find voluminous examples of multiple approaches pre- and post-5.6.0 throughout the documentation. If there were a diktat to scrub from history even the lingering scent of package global typeglob filehandles, the best one could say about it is that at least it moves slowly.

    Security flaws? Don’t you think that’s unnecessarily overstating things?

    Only in the sense that local privilege escalation errors are less bothersome than remote privilege escalation errors. Certainly I hope that the last line of defence never comes down to the presence or absence of a space between file mode and filename in the two-argument form of open, but I use the three-argument form on my own pervasively so I never have to worry about it not being a line of defence.

    Filehandles seem to me to be the least of several worrisome bareword issues.

    Predeclared (but not imported) subs are less a concern to me than bareword filehandles, but even though I know (most of) the rules of bareword disambiguation, I don't trust myself to remember all of the possible ways code I write could fall afoul of the mismatch between my heuristics and those of toke.c. Certainly careful forethought helps as do good habits and plenty of practice, but given a reasonable and well-distributed alternative that has other advantages, my preference is clear.

    I'd even write class names such as My::Class::, if one percent of CPAN authors also did so. Alas, that disambiguation is so relatively unknown.

      You’re right that imported subs are the primary issue, since that’s too much like strange action at a distance. I don’t expect to hose myself so obviously as in the code I showed. There isn’t really a good solution, or a really good solution, or something like that.

      As for using package-quoted names like IO::Handle::, I actually do it pretty regularly; it automatically cleans up problems that most people don’t ever want to know about.

      We’ve known of this approach for a very long time now, so there’s no excuse not to use it. Perhaps better stated, I don’t consider general CPAN ignorance of it to be any excuse not to use it. What you don’t know can hurt you.

      I’m sure there is no shortage of CPAN authors who neglect to check the return values of their close calls, too, but that isn’t going to stop me from religiously doing so. I try to apply more rigorous standards of correctness (or even advisability) than simply looking to majority practice. That’s too much like using Google to rank various misspellings to decide which one “must” be right.

      Try it, you’ll like it. Who knows, you could even be a trend-setter.

        The only places I've ever seen bareword class names go wrong in practice are with single-element names such as CGI or (recently) JSON. Naming conventions can alleviate that, but that might be a good place to start.

        I remember the other reason I hesitated to do so: aliased. I'm not sure whether that's a pro or a con.

Re^5: unquoted string error??!!
by Anonymous Monk on May 05, 2011 at 08:32 UTC
    And what do you do when a user comes to you unhappy that the old standard copy pattern:
    print OUTPUT while <INPUT>;
    Has no corresponding clean translation? How do you explain that one to them? There is no nice story here: lexical filehandles can too easily break several standard practices that people have come to rely on more than they realize. It is never a pleasant task to explain these strange failures that can result.

    Can you explain what you mean?