in reply to Re: expanding the functionality of split
in thread expanding the functionality of split

This will actually fail: the alternation tries ':' before '::', so a '::' is split as two separate ':' delimiters and you get an unwanted null field between them. You need /::|:|\s+/ instead. See below.

$string = "a:b::c d";
@fields = split(/:|::|\s+/, $string);
print "Got '$_'\n" for @fields;

__DATA__
Got 'a'
Got 'b'
Got ''
Got 'c'
Got 'd'
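For comparison, here is a minimal sketch of the same split with the longer alternative listed first. Perl's alternation tries the alternatives left to right at each position, so '::' has to come before ':' if it is to be consumed as a single delimiter. The input string is the one used above; `use strict` and the `my` declarations are additions.

```perl
use strict;
use warnings;

my $string = "a:b::c d";

# '::' is listed before ':' so the two-character delimiter wins at that position.
my @fields = split /::|:|\s+/, $string;

print "Got '$_'\n" for @fields;
```

With this ordering the split yields the four fields a, b, c and d, with no empty field between b and c.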

cheers

tachyon

s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

Re: Re: Re: expanding the functionality of split
by runrig (Abbot) on Dec 10, 2002 at 00:20 UTC
    You need to do /::|:|\s+/.

    And you could shorten that to just /::?|\s+/
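A quick way to check that the shortened pattern behaves the same is to run both against the same input. This is only a sketch using the sample string from the parent post; the variable names are made up for illustration.

```perl
use strict;
use warnings;

my $string = "a:b::c d";

my @long  = split /::|:|\s+/, $string;   # explicit alternation, longest first
my @short = split /::?|\s+/,  $string;   # ':' optionally followed by a second ':'

print "Same fields\n" if "@long" eq "@short";
```

The `?` quantifier is greedy, so `::?` takes the second colon whenever one is there, which is why no empty field appears.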

      Maybe I did not explain what I meant properly. I don't want to match ':' OR '::' OR ..., I want to match ':' THEN '::' THEN... I want to effectively change what the split operator uses as the splitting pattern mid-split. After the first field is found using the first delimiter pattern, that pattern is discarded and the next field can only be delimited with the next pattern. The patterns move from left to right and MUST match in the order that they are presented in.

      tigervamp

        That is not a split then, that is a pattern match. The problem remains that a line like 'a::b:: c' will match according to your rules but the result will be 'a', ':b', 'c'. You can do this easily with recursion:

        my @split_bits = qw( : :: \s+ );
        my @result = wierd_split( 'foo:bar::baz boo:foo::bar::baz', \@split_bits );
        print "Got $_\n" for @result;

        sub wierd_split {
            my ( $str, $to_do, $got ) = @_;
            return (@$got, $str) unless @$to_do;
            my $next_split = shift @$to_do;
            print scalar @$to_do, " splitting '$str' on '$next_split'", "\n";
            my ( $want, $next ) = $str =~ m/(.*?)$next_split(.*)/m;
            print "'$want' '$next'\n";
            push @{$got}, $want;
            # and now for a little recursion
            wierd_split( $next, $to_do, $got ) if $want;
        }

        __DATA__
        2 splitting 'foo:bar::baz boo:foo::bar::baz' on ':'
        'foo' 'bar::baz boo:foo::bar::baz'
        1 splitting 'bar::baz boo:foo::bar::baz' on '::'
        'bar' 'baz boo:foo::bar::baz'
        0 splitting 'baz boo:foo::bar::baz' on '\s+'
        'baz' 'boo:foo::bar::baz'
        Got foo
        Got bar
        Got baz
        Got boo:foo::bar::baz
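The same idea can also be written as a loop. What follows is a hedged sketch rather than a drop-in replacement: `ordered_split` is a hypothetical name, and the choice to stop and return the remainder whole when a delimiter is not found is an assumption about the desired behaviour.

```perl
use strict;
use warnings;

# Hypothetical helper, not from the thread: applies each delimiter once, in order.
sub ordered_split {
    my ( $str, @delims ) = @_;
    my @got;
    for my $delim (@delims) {
        # Non-greedy match up to the first occurrence of the current delimiter.
        my ( $field, $rest ) = $str =~ /\A(.*?)$delim(.*)\z/s
            or last;    # delimiter not found: stop and keep the remainder whole
        push @got, $field;
        $str = $rest;
    }
    # Whatever is left after the last delimiter becomes the final field.
    return ( @got, $str );
}

my @result = ordered_split( 'foo:bar::baz boo:foo::bar::baz', ':', '::', '\s+' );
print "Got $_\n" for @result;
```

Unlike the recursive version, this one leaves the caller's delimiter list untouched and keeps going when a field is empty or '0'.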

        tachyon

        s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print