I think split does return undef for delimiters that weren't captured, when the split regex also contains capturing parens. Look at this example:

    my $string = qq/This&and+that/;
    my @segments = split /(&)|\+/, $string;
    print "$_\n" foreach @segments;

    __OUTPUT__
    This
    &
    and
    Use of uninitialized value in concatenation (.) or string at mytest.pl line 4.

    that

In that example, you can see that when the non-capturing alternative (the \+) is the part that matches, undef gets plopped into the list element for that delimiter.

As for documentation, the POD for split says, "If the PATTERN contains parentheses, additional list elements are created from each matching substring in the delimiter."

That is correct as far as it goes: if the PATTERN contains parentheses, an additional element is created for each delimiter match. What it doesn't tell you is that the element is only populated with a value if the part of the PATTERN that actually matched sits inside the capturing parens. If the part that matched isn't captured, the element is still created (since parens were used somewhere else within PATTERN), but it is left as undef.
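To make that concrete, here is a minimal sketch comparing one capture group against two. The show() helper is my own invention for illustration (it just prints undef visibly so nothing warns); everything else is plain split.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Perl): quote defined values and
# spell out undef, so the list can be printed without warnings.
sub show { return map { defined $_ ? "'$_'" : 'undef' } @_ }

my $string = 'This&and+that';

# One capture group: each delimiter match adds exactly one element.
# When the \+ alternative matches, the group did not take part, so
# the element is undef.
my @one_group = split /(&)|\+/, $string;
print join(', ', show(@one_group)), "\n";
# 'This', '&', 'and', undef, 'that'

# Two capture groups: each delimiter match adds two elements, and
# whichever group did not take part in the match is undef.
my @two_groups = split /(&)|(\+)/, $string;
print join(', ', show(@two_groups)), "\n";
# 'This', '&', undef, 'and', undef, '+', 'that'
```

So the number of extra elements per delimiter follows the number of capture groups in the PATTERN, not the number of groups that happened to match.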

In this case, I would consider this a bug, either in the documentation (for not documenting what happens if you combine both capturing and noncapturing components in the split PATTERN), or a bug in Perl's split, for not quite accomplishing DWIMery.
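Until that's sorted out one way or the other, one possible workaround is to filter the undefs out after the split. This is only a sketch, and split_keeping_captured is a name I made up for it. It assumes you never want an undef element, which should be safe because the ordinary fields split returns are strings (possibly empty), never undef.

```perl
use strict;
use warnings;

# Hypothetical helper: split on $pattern, keeping captured delimiters
# but dropping the undef placeholders left by capture groups that did
# not take part in a given delimiter match.
sub split_keeping_captured {
    my ($pattern, $string) = @_;
    return grep { defined } split $pattern, $string;
}

my @segments = split_keeping_captured(qr/(&)|\+/, 'This&and+that');
print "$_\n" foreach @segments;
# This
# &
# and
# that
```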


Dave


In reply to Re^4: split and capture some of the separators by davido
in thread split and capture some of the separators by shemp
