However, one construct is the string literal. Something like 8"foobar" or U"tweedle" could have spaces within the quotes, and the token is properly delimited by the closing quote, not by whitespace.
I don't want to have to go to a full-blown fancy parser just to handle this one little case. I think a two-pass system could do it, with the first pass noticing the quotes and escaping any spaces inside them. But that seems inelegant. Any regex wizards care to tackle this?
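For comparison, the two-pass idea could look roughly like the sketch below. The sample line and the NUL placeholder are my own assumptions (the placeholder must never occur in real input), and it only protects plain spaces, not tabs or escaped quotes.

```perl
use strict;
use warnings;

# Hypothetical sample line; the 8"..." and U"..." forms are the ones described above.
my $line = 'push 8"foo bar" U"tweedle dee" pop';

# Pass 1: inside each quoted run, swap spaces for a placeholder byte.
# Assumption: "\x00" never appears in the input.
(my $escaped = $line) =~ s{("[^"]*")}{ (my $q = $1) =~ tr/ /\x00/; $q }ge;

# Pass 2: an ordinary whitespace split, then restore the protected spaces.
my @two_pass = map { (my $t = $_) =~ tr/\x00/ /; $t } split ' ', $escaped;

print "$_\n" for @two_pass;
```

It works, but it touches every quoted string twice, which is the part that feels inelegant.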
Two ideas. First, split is told what the delimiter is, as opposed to what to keep (as with m//g). Using advanced regex features, I could tell it to reject a space if it's in the middle of a quotation. Second, using @list = m/blah/g instead of split is different, and might be more straightforward by some ways of thinking about it, because it doesn't need to look outside the area it's working on.
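A minimal sketch of both ideas side by side, assuming a made-up sample line and assuming string literals never contain an escaped quote. The split version uses a lookahead to accept whitespace only when an even number of quotes remains to the end of the line; the m//g version just describes what a token looks like.

```perl
use strict;
use warnings;

# Hypothetical sample line; only the 8"..." and U"..." forms come from the post.
my $line = 'push 8"foo bar" U"tweedle dee" pop';

# Idea 1: split on whitespace that is NOT inside a quotation, i.e. whitespace
# followed by an even number of double quotes up to the end of the string.
my @by_split = split /\s+(?=(?:[^"]*"[^"]*")*[^"]*\z)/, $line;

# Idea 2: say what to keep. A token is either an optional prefix glued to a
# quoted string, or a plain run of non-whitespace.
my @by_match = $line =~ /([^\s"]*"[^"]*"|\S+)/g;

print "$_\n" for @by_match;
```

The split has to scan to the end of the line at every candidate space, while the m//g pattern only ever looks at the token in front of it.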
But what about two distinct regular expressions sharing the same current position? See which one matches at the current spot, and immediately know what to do with it rather than having to figure it out again. I thought I saw something about that once... the current position is part of the string, not part of the regex. But doesn't the regex instance also keep track of something?
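Here is a hedged sketch of that shared-position approach. In Perl the match position is stored with the scalar (readable via pos()), the \G assertion anchors a pattern at that position, and the /c modifier keeps the position when a /g match fails, so several patterns can take turns on one string. The sample line and the STRING/WORD labels are my own, for illustration only.

```perl
use strict;
use warnings;

# Hypothetical sample line and token labels.
my $line = 'push 8"foo bar" U"tweedle dee" pop';
my @tokens;

pos($line) = 0;    # the position belongs to $line, not to any one regex
while (pos($line) < length $line) {
    if ($line =~ /\G\s+/gc) {
        next;                                # skip whitespace between tokens
    }
    elsif ($line =~ /\G([^\s"]*"[^"]*")/gc) {
        push @tokens, [ STRING => $1 ];      # prefix plus quoted string
    }
    elsif ($line =~ /\G(\S+)/gc) {
        push @tokens, [ WORD => $1 ];        # anything else up to whitespace
    }
    else {
        die "cannot tokenize at offset " . pos($line);
    }
}

print "$_->[0]: $_->[1]\n" for @tokens;
```

Each branch knows which kind of token it just matched, so nothing has to be re-examined afterwards.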
In reply to Not quite a simple split by John M. Dlugosz