TASdvlper has asked for the wisdom of the Perl Monks concerning the following question:

All,

I'm a little stumped. I have the following string:

my $str = "CONFIGURATION,Validate C++ CLI Common Syntax (-h, -m, -np, +-p, -t, -q, -v, -ncr, CMD) via ndu -status on CX700,test-dvlp,yes";
And I want to split on the comma but I don't want to split on the comma if there is a space(s) after it. So, basically, I want the elements in my array to be:

element 0: CONFIGURATION
element 1: Validate C++ CLI Common Syntax (-h, -m, -np, -p, -t, -q, -v, -ncr, CMD) via ndu -status on CX700
element 2: test-dvlp
element 3: yes

Is there some way of combining a regexp with the split command?

Any help would be greatly appreciated.

Re: A question on splitting
by ysth (Canon) on Jan 14, 2004 at 20:40 UTC
    split always uses a regex (except for the special case of a single blank). So you want: split /,(?! )/, $str
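
A minimal sketch of the suggestion above, run against the string from the question. The printf loop is only there for illustration and is not part of the original reply.

```perl
use strict;
use warnings;

my $str = "CONFIGURATION,Validate C++ CLI Common Syntax (-h, -m, -np, +-p, -t, -q, -v, -ncr, CMD) via ndu -status on CX700,test-dvlp,yes";

# Split on a comma only when the character right after it is not a space.
my @fields = split /,(?! )/, $str;

# Show each element with its index, as in the question.
printf "element %d: %s\n", $_, $fields[$_] for 0 .. $#fields;
```

This should yield four elements, with the parenthesised option list kept inside element 1.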
      So, let me see if I can understand your regex.

      Split on "," if (0 or 1 occurrences) (?) and if there is NOT (!) a space after ( ).

      So, if I wanted to check for 1 or more spaces, would this be correct?

      split /,(?! +)/, $str
        There's no need to check for more than one space. If a comma is followed by say 5 spaces, it's also followed by a space. There's only one character relevant here, and that's the character right after the comma.

        Abigail

        Not quite. (?!pattern) is a special assertion that says the regex should only succeed if pattern would not match at that point (see perlre). So the regex has two parts: , (comma) and (?! ) (not followed by a space).

        Note that (?! ) (or its positive counterpart (?= )) does not consume any part of the string. So /a(?=0)\d{2}/ applied to "a012" will match just "a01".

        Changing it to have a + doesn't affect anything, since if it has more than one space after the comma, it definitely has one space.
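
A short sketch of both points made above: the lookahead consumes nothing, and adding + inside the negative lookahead changes nothing. The $sample string is a made-up example, not data from the thread.

```perl
use strict;
use warnings;

# (?=0) checks that a "0" follows "a" but does not consume it,
# so \d{2} still begins matching at that same "0".
my ($match) = "a012" =~ /(a(?=0)\d{2})/;
print "$match\n";

# Hypothetical sample with one, two, and zero spaces after commas.
my $sample = "a,b,  c, d,e";

# Both patterns refuse to split at a comma that has a space right after it.
my @one  = split /,(?! )/,  $sample;
my @many = split /,(?! +)/, $sample;

print join("|", @one),  "\n";
print join("|", @many), "\n";
```

Both split calls should print the same three fields, since only the character immediately after the comma decides the outcome.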

Re: A question on splitting
by mpolo (Chaplain) on Jan 14, 2004 at 20:44 UTC

    This is almost certainly the roundabout way of doing this, but you should be able to achieve your result by replacing (s///) the commas without spaces after them with something that does not occur in your dataset, like a tilde ~, for instance. Then you can split on the tilde.

    Update: While this would work, it is really a silly answer. As another poster has indicated, there is a reason why you put your split argument inside of //, and that's that it's a regular expression. (I think that using quotes suppresses regular expression processing in split, though.)
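
For comparison, the substitute-then-split route described above might look roughly like this. It is only a sketch: the sample string is shortened, and it assumes a tilde never occurs in the data, which the guard checks.

```perl
use strict;
use warnings;

# Shortened, hypothetical version of the string from the question.
my $str = "CONFIGURATION,Validate (-h, -m) via ndu,test-dvlp,yes";

# The placeholder must not already appear in the data.
die "placeholder already present in data\n" if $str =~ /~/;

# Replace every comma that is not followed by a space with a tilde...
(my $marked = $str) =~ s/,(?! )/~/g;

# ...then split on the tilde.
my @fields = split /~/, $marked;
print "$_\n" for @fields;
```

The same lookahead ends up in the substitution, which is why splitting on it directly is the shorter path.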

      I think that using quotes suppresses regular expression processing in split, though.
      Nope:
      $ perl -wle'print for split "b+", "abbc"'
      a
      c
      Which is why you should always use // (or m:: or whatever). Without that habit, it's easier to say things like split "." (which devours all input except newlines :) or split "?" (which dies, since that's not a valid regex) by accident.

      Though, as I said, split ' ' is an exception.

      I think that using quotes suppresses regular expression processing in split, though

      Nope. The first argument to split is always a regular expression except for the one special case of " ".
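
A small sketch of the behaviour discussed in these replies. The input strings are made up for illustration, and the \Q...\E line is an addition showing one common way to split on a literal string.

```perl
use strict;
use warnings;

# A quoted pattern is still compiled as a regex: "b+" means "one or more b".
my @by_string = split "b+", "abbc";

# To split on a literal dot, escape it (here with \Q...\E).
my @by_literal = split /\Q.\E/, "a.b.c";

# The one special case: the string ' ' splits on runs of whitespace
# and skips leading whitespace, unlike the regex / /.
my @special  = split ' ', "  foo   bar ";
my @regex_sp = split / /, "  foo   bar ";

print scalar(@by_string),  ": ", join("|", @by_string),  "\n";
print scalar(@by_literal), ": ", join("|", @by_literal), "\n";
print scalar(@special),    ": ", join("|", @special),    "\n";
print scalar(@regex_sp),   ": ", join("|", @regex_sp),   "\n";
```

The last two lines differ because / / treats every single space as a separator and so produces empty leading fields, while ' ' does not.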