in reply to split on delimiter unless escaped

I'm not sure why you're promoting single delimiters into multiples of the same character; it seems to me that would make things more difficult rather than easier. You probably want a placeholder that doesn't occur in your data at all. I'm guessing you want '!' rather than the more usual '\' either to avoid escaping your escape character in Perl or because you're dealing with a data format that someone else has already set.

Here's a quick poke at what you describe, although if you get any fancier than this you'd probably want to work on a real (if minimal) parser.

while ( <> ) {
    chomp;
    s/!;/\x001/g;
    s/;/\x000/g;
    s/\x001/;/g;
    s/!!/!/g;
    @a = split /\x000/;
    print '{' . (join '}{', @a) . "}\n";
}
__END__
foo;bar;baz
fred!;flintstone;barney!!rubble
eggs!;spam!;toast!;spam;bacon!!;eggs!;spam!!toast!;spam;spamspam!!eggs!!;spam

Which prints:

{foo}{bar}{baz}
{fred;flintstone}{barney!rubble}
{eggs;spam;toast;spam}{bacon!;eggs;spam!toast;spam}{spamspam!eggs!;spam}
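If you do go the "real (if minimal) parser" route, a character-at-a-time scanner is probably enough. What follows is only a sketch: the name split_escaped is made up for illustration, and it assumes that '!' makes whatever single character follows it literal (so '!;' is ';' and '!!' is '!'), which is a little broader than what the substitutions above handle. A lone '!' at the end of the line is kept as-is.

```perl
use strict;
use warnings;

# Hypothetical helper: split $line on ';', treating '!' as an escape
# character. Assumption: '!' makes the next character literal, whatever it is.
sub split_escaped {
    my ($line) = @_;
    my @fields = ('');
    my @chars  = split //, $line;
    while (@chars) {
        my $c = shift @chars;
        if ($c eq '!' && @chars) {
            # Escape: append the following character without interpreting it.
            $fields[-1] .= shift @chars;
        }
        elsif ($c eq ';') {
            # Unescaped delimiter: start a new field.
            push @fields, '';
        }
        else {
            # Ordinary character (or a trailing lone '!').
            $fields[-1] .= $c;
        }
    }
    return @fields;
}

print '{', join('}{', split_escaped('fred!;flintstone;barney!!rubble')), "}\n";
# prints {fred;flintstone}{barney!rubble}
```

Because the scanner consumes an escape and its target together, '!!;' comes out as a literal '!' followed by a real delimiter. Unlike split, it keeps trailing empty fields and returns one empty field for an empty line.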

Re^2: split on delimiter unless escaped
by ikegami (Patriarch) on Nov 09, 2010 at 22:50 UTC

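    As you've demonstrated, it fails to split

    bacon!!;eggs

    because s/!;/.../ runs before s/!!/.../, so the second '!' of the '!!' pair gets paired with the ';' and the delimiter is treated as escaped.

    Fix: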

    while (<>) {
        chomp;
        s/\x{00}/\x{00}0/g;
        s/!!/\x{00}1/g;
        s/!;/\x{00}2/g;
        my @a = split /;/;
        for (@a) {
            s/\x{00}2/;/g;
            s/\x{00}1/!/g;
            s/\x{00}0/\x{00}/g;
        }
        ...
    }

    I also fixed the inability to have char 00 in the data.
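To make the placeholder approach easy to try against the sample lines, it can be wrapped in a sub. This is a sketch only: the name split_with_placeholders, the sub wrapper and the print loop are assumptions added for illustration, not what the elided part of the snippet contained.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the encode / split / decode steps.
sub split_with_placeholders {
    my ($line) = @_;

    # Encode: protect literal NULs first, then hide both escape sequences.
    # '!!' has to be handled before '!;' so that '!!;' ends in a real delimiter.
    $line =~ s/\x{00}/\x{00}0/g;
    $line =~ s/!!/\x{00}1/g;
    $line =~ s/!;/\x{00}2/g;

    # Any ';' still present is an unescaped delimiter.
    my @fields = split /;/, $line;

    # Decode each field, undoing the encoding in reverse order.
    for (@fields) {
        s/\x{00}2/;/g;
        s/\x{00}1/!/g;
        s/\x{00}0/\x{00}/g;
    }
    return @fields;
}

for my $line ('foo;bar;baz', 'fred!;flintstone;barney!!rubble', 'bacon!!;eggs') {
    print '{', join('}{', split_with_placeholders($line)), "}\n";
}
# prints:
# {foo}{bar}{baz}
# {fred;flintstone}{barney!rubble}
# {bacon!}{eggs}
```

Since every NUL in the encoded string is followed by 0, 1 or 2, the decode step is unambiguous even when the data itself contains NUL characters. As with any plain split, trailing empty fields are dropped.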