I was anticipating this very answer, but didn't want to clobber my first post and hoped I wouldn't have to write this reply. :-)
Short version:
It's unnecessary to mention leading empty fields in that paragraph as default behaviour because
See Re^2: splitting nothing? for a suggested documentation patch.
Update: This doesn't mean that clarifications of how split() works shouldn't be made. I'm just arguing that adding yet another rule to how it works isn't the way to go, and that by removing the sentence in question we actually make the documentation of split clearer.
The really long version for the particularly interested:
As I see it, there are at least two ways to solve this. One way is to do as the patch at Re^3: splitting nothing? does and introduce yet more complexity by saying that empty leading fields that are also empty trailing fields aren't empty leading fields but empty trailing fields. Another is to attack the problem at the root and not confuse the reader with leading empty fields at all.
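To make the overlap concrete, here is a small sketch (the sample strings are mine, not from the thread) of fields that are leading only versus fields that are leading and trailing at the same time:

```perl
use strict;
use warnings;

# Leading empty fields followed by a non-empty field are kept.
my @mixed = split /,/, ',,a';    # ('', '', 'a')

# Here every field is empty, so each one is both "leading" and
# "trailing"; the trailing-field rule removes them all.
my @only_empty = split /,/, ',,';    # ()

printf "%d %d\n", scalar(@mixed), scalar(@only_empty);    # prints "3 0"
```

The second call is the case the patch has to describe with an extra rule.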
You can tell split() not to discard trailing empty fields. However, you cannot tell split() to disregard leading empty fields in the general case; it's only done for one particular case (if one chooses to look at it as removal of empty fields rather than skipping of leading whitespace; see below). To me, it's more confusing to call that a default behaviour than to just document the special case.
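A minimal sketch of that asymmetry, again with a made-up input string: a negative LIMIT switches off the stripping of trailing empty fields, but there is no corresponding switch for leading ones, so you have to drop them yourself if you want them gone.

```perl
use strict;
use warnings;

my $str = ',a,,';

# Default: trailing empty fields are stripped, the leading one stays.
my @default = split /,/, $str;        # ('', 'a')

# Negative LIMIT: trailing empty fields are preserved too.
my @keep_all = split /,/, $str, -1;   # ('', 'a', '', '')

# No LIMIT value removes leading empty fields; do it by hand.
my @no_leading = @default;
shift @no_leading while @no_leading && $no_leading[0] eq '';

printf "%d %d %d\n", scalar(@default), scalar(@keep_all), scalar(@no_leading);
# prints "2 4 1"
```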
This "undefault" behaviour is already explained in the documentation:
If PATTERN is also omitted, splits on whitespace (after skipping any leading whitespace).
As we see, the documentation already resolves this issue by saying that for this special case the leading whitespace is skipped, rather than first splitting on it and then removing the resulting empty leading field. (My English isn't good enough to judge whether the documentation should put whitespace in the plural or singular, or whether the documentation can be read as splitting on /\s/ rather than /\s+/.)
split; is equivalent to do { split /\s+/, /\s*(.*)/s && $1 } for defined values of $_. The /\s+/ pattern would produce at most one leading empty field, which makes it excessive and confusing to talk about leading empty fields in the plural.
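As a rough illustration (the input string and variable names are mine), the following compares the special ' ' pattern, a plain /\s+/ split, and a spelled-out version of the "skip leading whitespace, then split" idea. It uses an explicit capture variable instead of $1 in the do-block, which is my restatement rather than the exact expression above.

```perl
use strict;
use warnings;

my $line = "  foo bar";

# The special ' ' pattern skips leading whitespace first.
my @special = split ' ', $line;       # ('foo', 'bar')

# A plain /\s+/ split yields exactly one leading empty field,
# no matter how much leading whitespace there is.
my @regex = split /\s+/, $line;       # ('', 'foo', 'bar')

# Skip the leading whitespace by hand, then split on /\s+/.
my ($rest) = $line =~ /\s*(.*)/s;
my @emulated = split /\s+/, $rest;    # ('foo', 'bar')

print join('|', @special), "\n";      # foo|bar
print join('|', @regex), "\n";        # |foo|bar
print join('|', @emulated), "\n";     # foo|bar
```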
This is further explained:
A split on /\s+/ is like a split(' ') except that any leading whitespace produces a null first field.
... and first can be last: we have said something about the last field if it's empty but nothing about the first field, so there is no problem here either (except for split(/\s+/, '', -1), which produces an empty list, but that's another issue and is also documented in perlfunc a couple of paragraphs above: "Note that splitting an EXPR that evaluates to the empty string always returns the empty list, regardless of the LIMIT specified.").
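A short sketch of the two cases in that parenthesis, using a single space as a sample whitespace-only input (my choice, not from the thread):

```perl
use strict;
use warnings;

# Empty EXPR: always the empty list, even with a negative LIMIT.
my @from_empty = split /\s+/, '', -1;     # ()

# Whitespace-only EXPR: the first field is also the last one.
my @blank_limit   = split /\s+/, ' ', -1; # ('', '')
my @blank_default = split /\s+/, ' ';     # () - stripped as trailing

printf "%d %d %d\n",
    scalar(@from_empty), scalar(@blank_limit), scalar(@blank_default);
# prints "0 2 0"
```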
I really believe that the magical disappearance of the leading empty field is documented well enough to justify my suggestion. If one really feels it's out of place not to mention this special case in the same sentence or paragraph (which would be a real pain if it were always done in the perldocs, as Perl is full of special cases), just add a parenthesis that says "except for the special ' ' pattern; see below".
Not mentioning leading empty fields avoids the question of whether ('')[0] is a leading or a trailing empty field, and at the same time reduces the complexity of the documentation.
ihb
In reply to Re^4: splitting nothing? by ihb, in thread splitting nothing? by bageler