Re: splitting nothing?

Re^2: splitting nothing? by bageler (Hermit) on Jul 13, 2004 at 23:53 UTC
perhaps "If there are zero non-empty matches, all are treated as empty trailing fields and are deleted."
Re^3: splitting nothing? by ysth (Canon) on Jul 14, 2004 at 00:04 UTC
How does this look:

    --- perlfunc.pod.orig 2004-06-01 05:37:39.000000000 -0700
    +++ perlfunc.pod 2004-07-13 17:02:48.436164800 -0700
    @@ -4986,7 +4986,7 @@
     
     Splits the string EXPR into a list of strings and returns that list. By
     default, empty leading fields are preserved, and empty trailing ones are
    -deleted.
    +deleted. (If all fields are empty, they are considered to be trailing.)
     
     In scalar context, returns the number of fields found and splits into
     the C<@_> array. Use of split in scalar context is deprecated, however,
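For anyone who wants to see the case the parenthetical covers, here is a minimal sketch. The sample strings and variable names are made up for illustration; only `split` itself comes from the documentation under discussion.

```perl
use strict;
use warnings;

# Illustrative inputs, not taken from perlfunc.
my @mixed     = split /,/, ',a,,';   # leading empty field kept, trailing ones dropped
my @all_empty = split /,/, ',,,';    # every field is empty, so all count as trailing

printf "mixed: %d field(s)\n",     scalar @mixed;       # 2, i.e. ('', 'a')
printf "all empty: %d field(s)\n", scalar @all_empty;   # 0
```

The second call is the one the added sentence describes: with no non-empty field to anchor them, the empty fields are all removed.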
Re^4: splitting nothing? by bageler (Hermit) on Jul 14, 2004 at 15:59 UTC
sounds clear to me :)
Re^2: splitting nothing? by ihb (Deacon) on Jul 16, 2004 at 21:44 UTC
Why even mention empty leading fields? My suggestion is to change

"Splits the string EXPR into a list of strings and returns that list. By default, empty leading fields are preserved, and empty trailing ones are deleted."

to

"Splits the string EXPR into a list of strings and returns that list. By default, empty trailing fields are deleted."

Update: Suggested patch (two changes):

    4875,4876c4875
    < default, empty leading fields are preserved, and empty trailing ones are
    < deleted.
    ---
    > default, empty trailing fields are deleted.
    4953c4952
    < whitespace produces a null first field. A C<split> with no arguments
    ---
    > whitespace may produce a null first field. A C<split> with no arguments

See Re^4: splitting nothing? for motivation.

ihb
Re^3: splitting nothing? by ysth (Canon) on Jul 17, 2004 at 01:44 UTC
Because that's only the default. `split " "` (but not `split / /`) doesn't preserve leading empty fields.
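A small sketch of that difference, using an invented sample line; the variable names are only for illustration.

```perl
use strict;
use warnings;

my $line = '  a b';   # two leading spaces

my @special  = split ' ', $line;   # special-case pattern: leading whitespace is skipped
my @ordinary = split / /, $line;   # ordinary regex: each single space is a separator

print join('|', map { "[$_]" } @special),  "\n";   # [a]|[b]
print join('|', map { "[$_]" } @ordinary), "\n";   # []|[]|[a]|[b]
```

With the regex form, each of the two leading spaces produces its own empty leading field.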
Re^4: splitting nothing? by ihb (Deacon) on Jul 17, 2004 at 19:49 UTC
I was anticipating this very answer, but didn't want to clobber my first post and hoped I wouldn't have to write this reply. `:-)`

Short version: It's unnecessary to mention leading empty fields in that paragraph as default behaviour because, where this sentence currently stands, there's no specification of how split() works--just what it returns; ~~there's only one case that doesn't produce an otherwise expected empty leading field (singular);~~ there's a conflict over whether a list with only empty fields holds leading or trailing empty fields, as they can't be considered both in this case; ~~the only case~~ those cases that don't produce an expected leading empty field (singular) are well documented, and are already written in a way that doesn't conflict with trailing empty fields; and removing the mention reduces the complexity of the documentation without losing any information. See Re^2: splitting nothing? for a suggested documentation patch.

Update: This doesn't mean that clarifications of how split() works shouldn't be made. I'm just arguing that adding yet another rule about how it works isn't the way to go, and that by removing the sentence in question we actually make the documentation of split clearer.

The really long version, for the particularly interested:

As I see it, there are at least two ways to solve this. One is to do as the patch at Re^3: splitting nothing? does and introduce yet more complexity by saying that empty leading fields that are also empty trailing fields aren't empty leading fields but empty trailing fields. Another is to attack the problem at the root and not confuse the reader with leading empty fields at all.

You can tell split() not to ignore trailing empty fields. However, you cannot tell split() to disregard leading empty fields in the general case--it's only done for a particular case (if one chooses to look at it as removal of empty fields rather than skipping of leading whitespace--see below).
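To make the asymmetry concrete, a minimal sketch with a made-up record. A negative LIMIT keeps trailing empty fields, while dropping leading ones takes an explicit step after the split; the `shift` loop is just one assumed way of doing that.

```perl
use strict;
use warnings;

my $record = ',,a,,';   # illustrative input

my @default  = split /,/, $record;       # ('', '', 'a')
my @keep_all = split /,/, $record, -1;   # ('', '', 'a', '', '')

# No LIMIT value removes leading empty fields, so strip them by hand.
my @no_leading = @default;
shift @no_leading while @no_leading && $no_leading[0] eq '';

print scalar(@default), ' ', scalar(@keep_all), ' ', scalar(@no_leading), "\n";   # 3 5 1
```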
For me, it's more confusing to say that it's a default behaviour instead of just documenting the special case. This "undefault" behaviour is already explained in the documentation:

"If PATTERN is also omitted, splits on whitespace (after skipping any leading whitespace)."

As we see, the documentation already resolves this issue by saying that for this special case the leading whitespace is skipped, rather than first splitting on it and then removing the resulting empty leading field. (My English isn't good enough to judge whether the documentation should put whitespace in the plural or the singular, or whether the documentation can be interpreted as splitting on `/\s/` rather than `/\s+/`.) `split;` is equivalent to `do { split /\s+/, /\s*(.*)/s && $1 }` for defined values of `$_`. The `/\s+/` pattern would produce at most one leading empty field, which makes it excessive and confusing to talk about leading empty fields in the plural. This is further explained:

"A split on /\s+/ is like a split(' ') except that any leading whitespace produces a null first field."

... and first can be last, and we have said something about the last field if it's empty but nothing about the first field, so no problem here either (except for `split(/\s+/, '', -1)`, which produces an empty list--but that's another issue, and it too is documented in perlfunc a couple of paragraphs above: "Note that splitting an EXPR that evaluates to the empty string always returns the empty list, regardless of the LIMIT specified.").

I really believe that the magical disappearance of the leading empty field is documented well enough to justify my suggestion. If one really feels it's out of place not to mention this special case in the same sentence or paragraph (which would be a real pain if it were always done in the perldocs, as Perl is full of special cases), just add a parenthesis that says "except for the special `' '` pattern; see below".
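The "skip leading whitespace, then split on `/\s+/`" reading can be checked with a rough sketch like the following. `split_after_trim` is a hypothetical helper written for this comparison, not anything from perlfunc, and the test inputs are invented.

```perl
use strict;
use warnings;

# Hypothetical helper: emulate split ' ' with an ordinary /\s+/ pattern
# by removing leading whitespace before splitting.
sub split_after_trim {
    my ($string) = @_;
    (my $trimmed = $string) =~ s/^\s+//;
    return split /\s+/, $trimmed;
}

my $mismatches = 0;
for my $input ("  a b  ", "a b", "   ", "") {
    my @special  = split ' ', $input;
    my @emulated = split_after_trim($input);
    $mismatches++ unless @special == @emulated && "@special" eq "@emulated";
}
print "mismatches: $mismatches\n";   # mismatches: 0

# The empty string yields no fields even with a negative LIMIT.
my @from_empty = split /\s+/, '', -1;
print scalar(@from_empty), "\n";     # 0
```

Under this reading, no empty leading field is ever created for the `' '` pattern, so there is nothing to remove afterwards.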
Not mentioning leading empty fields avoids the conflict over whether `('')[0]` is a leading or a trailing empty field, and at the same time reduces the complexity of the documentation.

ihb
Re^5: splitting nothing? by ysth (Canon) on Jul 18, 2004 at 22:19 UTC
Re^6: splitting nothing? by ihb (Deacon) on Jul 19, 2004 at 11:17 UTC