What interests me from a 'why is it so?' point of view is this: if \s+ splits at the beginning of a string and finds the null string there, why not at the END as well? Your example documents the behaviour, but why is leading whitespace treated differently from trailing whitespace? There is, after all, a null string after the trailing whitespace as well.
The defaults of split are to drop trailing empty fields and to keep leading empty fields. Why these are the defaults, I can only speculate. Leaving off trailing empty fields is relatively harmless: an empty string is false, and so is a non-existent array element. But in many cases, leaving off leading empty fields wreaks havoc. Suppose you have some tabulated process data: controlling terminal, PID, UID, process name, arguments. Some processes don't have arguments, and some don't have a controlling terminal. If you leave off the empty arguments field, there's no harm. But if you leave off the empty controlling terminal field, then in the resulting list the PID is suddenly in position 0, not position 1.
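
To make the process-table argument concrete, here is a small sketch. The tab-separated records and their values are made up for illustration and are not real ps output; the only thing being shown is how many fields split returns and where the PID ends up.

```perl
use strict;
use warnings;

# Hypothetical tab-separated records: tty, PID, UID, name, arguments.
# The first record has no controlling terminal and no arguments.
my $no_tty_no_args = "\t123\t0\tinit\t";
my $full           = "tty1\t456\t1000\tvim\tnotes.txt";

# Default behaviour: the leading empty field is kept, the trailing one is dropped.
my @a = split /\t/, $no_tty_no_args;   # ('', '123', '0', 'init')
my @b = split /\t/, $full;             # ('tty1', '456', '1000', 'vim', 'notes.txt')

print scalar(@a), " fields, PID at index 1: $a[1]\n";
print scalar(@b), " fields, PID at index 1: $b[1]\n";

# A negative limit asks split to keep trailing empty fields as well.
my @c = split /\t/, $no_tty_no_args, -1;   # ('', '123', '0', 'init', '')
print scalar(@c), " fields with limit -1\n";
```

In both records the PID stays at index 1, which is the point of keeping the leading empty field. If you do need the trailing empty field, the negative limit in the last call gives it back.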

As for split ' ' leaving off leading empty fields, this is the exception, and specifically done to simulate the behaviour of AWK.
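
A quick side-by-side of the two forms, as a minimal sketch (the sample string is arbitrary):

```perl
use strict;
use warnings;

my $line = "  alpha beta  ";

# A regex pattern keeps the leading empty field and drops the trailing one.
my @by_pattern = split /\s+/, $line;   # ('', 'alpha', 'beta')

# The special single-space string skips leading whitespace first, AWK-style.
my @awk_style  = split ' ', $line;     # ('alpha', 'beta')

print scalar(@by_pattern), " fields with /\\s+/, ",
      scalar(@awk_style),  " fields with ' '\n";
```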

Abigail


In reply to Re: reg ex help by Abigail-II
in thread reg ex help by markd
