in reply to Re: expanding the functionality of split
in thread expanding the functionality of split

Idiomatic as always. Is there still a fat comma in Perl 6? However, the key point for me is that if you are going to split on an array of delimiters, you often need to sort them by length first to get the behaviour you want. The ':' '::' example is a good case in point. If the order is ':', '::' you will never split on '::', as Perl will always do the ':' split first and so return a number of (probably) unwanted empty fields wherever '::' appears in the string being split. This also holds true in the more usual case where you are doing a match or substitution on (on|a|range|of|odds|and|ends). With that order we would never match 'and', because 'a' is tried first at the same position and always succeeds, unless we applied boundary conditions, etc.
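A minimal sketch of the ordering effect described above. The sample string 'a::b:c' and all variable names are made up for illustration; the delimiters are treated as literal strings and quoted with quotemeta before being joined into an alternation.

```perl
use strict;
use warnings;

my $str    = 'a::b:c';      # illustrative input, not from the thread
my @delims = (':', '::');

# Order as given: ':' is tried first and always wins, so '::' never splits.
my $as_given = join '|', map { quotemeta($_) } @delims;
my @naive    = split /$as_given/, $str;

# Longest delimiter first, so '::' gets its chance before ':'.
my $by_length = join '|', map { quotemeta($_) }
                sort { length($b) <=> length($a) } @delims;
my @sorted    = split /$by_length/, $str;

# The same ordering effect in a plain match: 'a' beats 'and'.
my ($hit) = 'and' =~ /(on|a|range|of|odds|and|ends)/;

print join(',', @naive),  "\n";   # a,,b,c
print join(',', @sorted), "\n";   # a,b,c
print "$hit\n";                   # a
```

Sorting by length is only safe here because the delimiters are plain strings; it says nothing about delimiters that are themselves patterns.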

cheers

tachyon

s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

Re: expanding the functionality of split
by Abigail-II (Bishop) on Dec 10, 2002 at 15:29 UTC
    But sorting on length will not do. A trivial and silly example: when the delimiters are abc and ab*c*, if you try ab*c* first, you'll never succeed on abc.
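A small sketch of that example, assuming the delimiters are sorted by the length of their pattern text. Each pattern gets its own capture group so we can record which alternative actually fired; the test strings and variable names are invented for illustration.

```perl
use strict;
use warnings;

# Longest-first by string length puts 'ab*c*' (5 chars) ahead of 'abc' (3 chars).
my @patterns = sort { length($b) <=> length($a) } ('abc', 'ab*c*');

# One capture group per pattern so we can see which alternative fired.
my $re = join '|', map { "($_)" } @patterns;    # (ab*c*)|(abc)

my @winners;
for my $text ('abc', 'xxabcxx', 'abcabc') {
    while ($text =~ /$re/g) {
        # $1 is defined only when the first alternative matched.
        push @winners, defined($1) ? $patterns[0] : $patterns[1];
    }
}

print "$_\n" for @winners;    # every line reads ab*c*
```

Since ab*c* matches everything abc does, the abc branch is dead code in this order, however the inputs are chosen.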

    You could make an ordering if you can decide whether one regex will match everything another does. I doubt this is a decidable question for Perl regular expressions. It is for "normal" regular expressions, and, IIRC, undecidable for context-free grammars. Perl regular expressions are hard to classify in this sense, but even if it's theoretically possible, it's not going to be cheap, and hence the price would be high.

    It's going to be the responsibility of the programmer to pass in the options in a logical order, just as is already required for alternations in regular expressions.
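To make that concrete, here is a rough sketch of what a multi-delimiter split could look like if it simply trusts the caller's order. The name split_any is hypothetical (it is not a proposal from this thread), and the input string is invented; the helper does no reordering at all.

```perl
use strict;
use warnings;

# Hypothetical helper: split $string on any of @patterns, trying them
# strictly in the order the caller supplied.
sub split_any {
    my ($string, @patterns) = @_;
    my $alternation = join '|', map { "(?:$_)" } @patterns;
    return split /$alternation/, $string;
}

my $input = 'a , b c';

# Caller lists the more specific pattern first: the comma swallows its spaces.
my @comma_first = split_any($input, qr/\s*,\s*/, qr/\s+/);

# Caller lists bare whitespace first: the space before the comma is split off
# on its own, leaving an empty field.
my @space_first = split_any($input, qr/\s+/, qr/\s*,\s*/);

print join('|', @comma_first), "\n";   # a|b|c
print join('|', @space_first), "\n";   # a||b|c
```

Neither pattern is "longer" than the other in any useful sense here, so only the caller can know which order is the logical one.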

    Abigail

      It is always incumbent upon the programmer to consider the edge cases. This particular trap is one I have found myself in several times, wondering how $var_x ended up with a null value. Fully 80% of the respondents to this node also tendered code that produced this result, albeit we were providing solutions to the wrong problem! There is a recursive solution to the actual problem that I posted here; doubtless you will be able to demonstrate a failure edge case... but you get that.

      cheers

      tachyon

      s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print