DreamT has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I have this phrase: "word1,word2, word3,word4,word5", and want to split it into:
word1
word2, word3
word4
word5
I.e., I only want to split if the comma _isn't_ followed by whitespace. (if I split on comma followed by non-whitespace, I will also lose the first character in the word).

How do I do it?

Replies are listed 'Best First'.
Re: Split on comma without whitespace after
by Ratazong (Monsignor) on Nov 26, 2010 at 15:47 UTC
    my $str = "word1,word2, word3,word4,word5";
    my @arr = split /,(?=\S)/, $str;

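    The pattern above uses a positive look-ahead: (?=\S) checks that the character after the comma is non-whitespace without making it part of the separator, so nothing is lost from the next word. See perlre for more info (look at the chapter on "Look-Around Assertions" there).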
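A minimal sketch of the difference, using the string from the question (the array names @consumed and @kept are only for illustration). When \S is matched directly it becomes part of the separator and split throws it away; inside a look-ahead it is only checked.

```perl
use strict;
use warnings;

my $str = "word1,word2, word3,word4,word5";

# \S is part of the separator here, so the first character of each
# following field is discarded together with the comma.
my @consumed = split /,\S/, $str;

# (?=\S) is zero-width: it looks at the next character without consuming it.
my @kept = split /,(?=\S)/, $str;

print join('|', @consumed), "\n";   # word1|ord2, word3|ord4|ord5
print join('|', @kept), "\n";       # word1|word2, word3|word4|word5
```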

    Rata
Re: Split on comma without whitespace after
by 7stud (Deacon) on Nov 26, 2010 at 23:50 UTC

    You don't need lookarounds. A word boundary (\b) will do the trick:

    use strict;
    use warnings;
    use 5.010;

    my $str = "word1,word2, word3,word4,word5";
    my @pieces = split /\b,\b/, $str;

    for my $piece (@pieces) {
        say "-->$piece<---";
    }

    output:
    -->word1<---
    -->word2, word3<---
    -->word4<---
    -->word5<---

    Neither a comma nor a space is a word character, so there is no word boundary between a comma and the space that follows it, and /\b,\b/ cannot match there.
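One caveat worth sketching: \b,\b requires a word character on both sides of the comma, so it is not quite the same test as "comma not followed by whitespace". The input below is hypothetical (it is not from the question) and has a field that starts with "-", which is neither whitespace nor a word character.

```perl
use strict;
use warnings;

# Hypothetical input: the second field begins with a non-word character.
my $str = "foo,-bar, baz";

# No word boundary exists between "," and "-", so this never splits.
my @by_boundary = split /\b,\b/, $str;

# "-" is non-whitespace, so the look-ahead version splits at the first comma.
my @by_lookahead = split /,(?=\S)/, $str;

print scalar(@by_boundary), "\n";    # 1
print scalar(@by_lookahead), "\n";   # 2
```

For input like the original phrase, where every field begins and ends with a word character, both patterns give the same result.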

Re: Split on comma without whitespace after
by ww (Archbishop) on Nov 26, 2010 at 17:42 UTC
    "...(if I split on comma followed by non-whitespace, I will also lose the first character in the word)."

    Literally true -- but only applicable if you think no further than splitting on a comma followed by a non-whitespace character; nonsense, otherwise. You need to think through the alternatives.

    • What would happen if you merely split on comma?
    • What if you state the regex in split as a negative?
    • Didn't you read the docs (and thereby find the methods recommended by previous responders)?
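The first two questions above can be tried directly. This is a small sketch on the string from the question; reading "state the regex as a negative" as a negative look-ahead, (?!\s), is an assumption about what was meant, and the array names are illustrative.

```perl
use strict;
use warnings;

my $str = "word1,word2, word3,word4,word5";

# Plain comma: "word2, word3" is split too, and " word3" keeps its leading space.
my @plain = split /,/, $str;

# Negative look-ahead: split on a comma that is NOT followed by whitespace.
my @negative = split /,(?!\s)/, $str;

print join('|', @plain), "\n";      # word1|word2| word3|word4|word5
print join('|', @negative), "\n";   # word1|word2, word3|word4|word5
```

On this input (?!\s) and (?=\S) behave the same; they differ only for a comma at the very end of the string, where (?!\s) matches and (?=\S) does not.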