Whitespace parsing happens before variable parsing in every bourne-ish shell I've used since the late 70s.
I guess the intersection of the sets of shells we have used is empty then. Anyway, here's the relevant portion of IEEE Std 1003.1. From section 2.6:
The order of word expansion shall be as follows:
1. Tilde expansion (see Tilde Expansion), parameter expansion (see Parameter Expansion), command substitution (see Command Substitution), and arithmetic expansion (see Arithmetic Expansion) shall be performed, beginning to end. See item 5 in Token Recognition.
2. Field splitting (see Field Splitting) shall be performed on the portions of the fields generated by step 1, unless IFS is null.
3. Pathname expansion (see Pathname Expansion) shall be performed, unless set -f is in effect.
4. Quote removal (see Quote Removal) shall always be performed last.
As you see, parameter expansion happens before word splitting.
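A minimal sketch of what that ordering means in practice, assuming a POSIX-style sh or bash (zsh in its default mode does not split unquoted parameters, so it would behave differently). The helper count_args is a made-up name for illustration; it just reports how many arguments it received.

```shell
#!/bin/sh
# count_args is a hypothetical helper: it prints the number of arguments it got.
count_args() { echo "$#"; }

file="data1 data2"

# Unquoted: $file is expanded first, then the result is field-split on IFS.
count_args $file      # prints 2

# Quoted: the expansion still happens, but the result is not field-split.
count_args "$file"    # prints 1
```

If splitting happened before expansion, both calls would see a single word, $file, and both would print 1.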
Here's the relevant section from the bash manual:
The order of expansions is: brace expansion, tilde expansion, parameter,
variable and arithmetic expansion and command substitution (done in a
left-to-right fashion), word splitting, and pathname expansion.
Of course, you'll say "Newfangled things! GNU, POSIX, who needs them! V7, that's what real men use." So be it. From the Unix V7 manual:
Blank interpretation
After parameter and command substitution, any results of substitution are
scanned for internal field separator characters (those found in $IFS)
and split into distinct arguments where such characters are found. Explicit
null arguments ("" or '') are retained. Implicit null
arguments (those resulting from parameters that have no values)
are removed.
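The same rules still hold in today's Bourne-descended shells. Here is a small sketch, assuming a POSIX-style sh or bash; count_args, empty, and list are illustrative names, not anything from the manual.

```shell
#!/bin/sh
# count_args is a hypothetical helper: it prints the number of arguments it got.
count_args() { echo "$#"; }

empty=

# Implicit null argument: the unquoted expansion yields nothing and is removed.
count_args $empty       # prints 0

# Explicit null arguments are retained.
count_args "$empty"     # prints 1
count_args "" ''        # prints 2

# Splitting uses the characters in IFS, not just blanks.
# The subshell keeps the IFS change from leaking into the rest of the script.
list="a:b:c"
( IFS=:; count_args $list )   # prints 3
```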
Now, I don't want to claim you are wrong, but if you have never programmed in the Unix V7 shell, GNU bash, or a POSIX-compliant shell, which shells have you used since the 70s?
As for "my" syntax:
< $file | wc -l
you erroneously put an extra pipe in there. Remove it, try again, and give yourself minus 1 point for bad copying.
You're right. Think it will help, removing that pipe? Let's find out!
$ echo "hello" > data1
$ echo "world" > data2
$ file="data1 data2"
$ < $file wc -l
bash: $file: ambiguous redirect
Nope. Guess my "useless cat" is still very very useful.
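For anyone who wants to reproduce the comparison non-interactively, here is a rough sketch. It assumes a POSIX-style sh or bash and that mktemp -d is available (it is common but not specified by POSIX). The exact error text for the failing redirect differs between shells, so the script only checks the exit status.

```shell
#!/bin/sh
# Work in a scratch directory so the test files don't clobber anything.
# Assumption: mktemp -d exists on this system.
tmp=$(mktemp -d) && cd "$tmp" || exit 1

echo "hello" > data1
echo "world" > data2
file="data1 data2"

# cat receives two arguments after field splitting, so both files are read.
# The arithmetic expansion strips any padding some wc implementations add.
lines=$(( $(cat $file | wc -l) ))
echo "cat \$file | wc -l: $lines"       # prints: cat $file | wc -l: 2

# A redirection target must be a single word, so this fails one way or another:
# bash reports an ambiguous redirect, other shells look for a file literally
# named "data1 data2", which does not exist here.
if ( < $file wc -l ) >/dev/null 2>&1; then
    redirect_ok=yes
else
    redirect_ok=no
fi
echo "redirect worked: $redirect_ok"    # prints: redirect worked: no
```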
Forgetting to write "$x" instead of $x is a classic shell programming mistake which results in things breaking for strings that contain whitespace. And it has been a classic mistake since the '70s.
If I were hiring for a job that required shell programming, that'd be one of the questions I'd ask.
However, the second part does work on real shells, just not on bash or csh.
I presume that by "real shells" you mean your current favourite shell, "zsh". You are only partially right. You are right that the syntax works, but not the semantics. In
file="data1 data2"
<$file wc -l
zsh does not give you the number of lines in the files "data1" and "data2". Instead, it gives you the number of lines of the file (singular) "data1 data2". The use of cat isn't going to save the day though,
file="data1 data2"
cat $file | wc -l
also gives a count of the number of lines in the file "data1 data2".
No doubt zsh has a way of getting the count of lines from both files, after all, zsh is supposed to have every feature under the sun and then some, but it's not <$file.
What shell were you using that didn't give you a count of the lines in both data1 and data2?
I tested the last snippet under Linux-x86 (Slackware 10) and NetBSD-sparc 1.6.2, using bash and ksh93 on each. In every case I got a count of the lines in both files data1 and data2.
Those of you seeing a count of the file "data1 data2" instead need to document what shells and systems you are seeing this on.
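Something like the following could be pasted into a report. It is only a sketch: it assumes mktemp -d and uname are available, and it identifies the shell through BASH_VERSION and ZSH_VERSION, so under other shells that line simply comes out blank and you would need to fill it in by hand.

```shell
#!/bin/sh
# Scratch directory for the test files (assumption: mktemp -d exists).
tmp=$(mktemp -d) && cd "$tmp" || exit 1

echo "hello" > data1
echo "world" > data2
file="data1 data2"

# Record the system and, where the shell exposes a version variable, the shell.
system=$(uname -sr)
shell_id="${BASH_VERSION:+bash $BASH_VERSION}${ZSH_VERSION:+zsh $ZSH_VERSION}"
echo "system: $system"
echo "shell:  $shell_id"

# How many fields does the unquoted expansion produce in this shell?
set -- $file
fields=$#
echo "fields from unquoted \$file: $fields"

# Line count as seen through cat; errors are shown so they end up in the report.
cat $file | wc -l
```

A shell that splits the expansion reports 2 fields and a line count of 2; one that does not split reports 1 field and a cat error about a missing file named "data1 data2".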