in reply to Parsing a line of text items
A Text::CSV (or Text::CSV_XS for speed) solution seems very appropriate, but if you need to roll your own, maybe something like:
    Win8 Strawberry 5.30.3.1 (64) Tue 03/30/2021 11:53:39
    C:\@Work\Perl\monks
    >perl -Mstrict -Mwarnings
    use 5.010;  # needs (?|...) branch reset

    my $rx_dq_body  = qr{ [^\\"]* (?: \\. [^\\"]* )* }xms;
    my $rx_unquoted = qr{ \S+ }xms;

    for my $args (
        '', ' ',
        '23 45.67 "John Marcus O\"Ddly" Surname',
        '"only \"quoted\" thing"',
        'no quoted stuff',
        ) {
        my $got_parsed_args = my @parsed_args =
            $args =~ m{ \G \s* (?| " ($rx_dq_body) " | ($rx_unquoted)) }xmsg;

        print ">$args< -> ";
        if ($got_parsed_args) {
            printf "%s \n", join ' ', map ">$_<", @parsed_args;
            }
        else {
            print "nada \n";
            }
        }
    ^Z
    >< -> nada
    > < -> nada
    >23 45.67 "John Marcus O\"Ddly" Surname< -> >23< >45.67< >John Marcus O\"Ddly< >Surname<
    >"only \"quoted\" thing"< -> >only \"quoted\" thing<
    >no quoted stuff< -> >no< >quoted< >stuff<
This needs Perl version 5.10+ for the (?|...) "branch reset" operator, but modification for pre-5.10 Perls is simple; let me know if you need it. The $rx_dq_body regex to match a double-quoted body supports embedded escaped double-quotes (and any other escaped character). You can play with this regex to get exactly what you want/need.
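For what it's worth, here is one way the pre-5.10 modification might look. This is only a sketch, not necessarily the variant the author has in mind: it replaces the branch reset with an ordinary non-capturing group, so the quoted body lands in $1 and the unquoted token in $2, and a loop picks whichever one is defined. The sub name parse_args is made up for illustration.

```perl
use strict;
use warnings;

my $rx_dq_body  = qr{ [^\\"]* (?: \\. [^\\"]* )* }xms;
my $rx_unquoted = qr{ \S+ }xms;

# Hypothetical helper (name is an assumption, not from the post above).
# Without (?|...), each alternative keeps its own capture group:
# $1 holds a double-quoted body, $2 holds an unquoted token.
sub parse_args {
    my ($args) = @_;

    my @parsed;
    while ($args =~ m{ \G \s* (?: " ($rx_dq_body) " | ($rx_unquoted) ) }xmsg) {
        # Test definedness rather than truth so that "" and 0 survive.
        # The // operator is avoided because it also needs 5.10.
        push @parsed, defined $1 ? $1 : $2;
    }
    return @parsed;
}

print join(' ', map ">$_<", parse_args('23 45.67 "John Marcus O\"Ddly" Surname')), "\n";
```

The list-context m//g one-liner from the original is avoided here because, without the branch reset, it would return an undef for the non-participating group of every match.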
Of course, lots of tests should be done to verify this (or any other solution) really does what you want.
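As a starting point for such tests, something like the following Test::More sketch might do. The cases are just the inputs and outputs from the run above; the wrapper sub parse_args and the table layout are assumptions for illustration, and real data will surely need more cases (unterminated quotes, adjacent quoted items, and so on).

```perl
use strict;
use warnings;
use 5.010;  # needs (?|...) branch reset
use Test::More;

my $rx_dq_body  = qr{ [^\\"]* (?: \\. [^\\"]* )* }xms;
my $rx_unquoted = qr{ \S+ }xms;

# Hypothetical wrapper around the match from the post (name is an assumption).
sub parse_args {
    my ($args) = @_;
    return $args =~ m{ \G \s* (?| " ($rx_dq_body) " | ($rx_unquoted)) }xmsg;
}

# Each case: [ input string, expected list of parsed items ].
my @cases = (
    [ '',                                       [] ],
    [ ' ',                                      [] ],
    [ '23 45.67 "John Marcus O\"Ddly" Surname', [ '23', '45.67', 'John Marcus O\"Ddly', 'Surname' ] ],
    [ '"only \"quoted\" thing"',                [ 'only \"quoted\" thing' ] ],
    [ 'no quoted stuff',                        [ qw(no quoted stuff) ] ],
);

for my $case (@cases) {
    my ($input, $expected) = @$case;
    is_deeply [ parse_args($input) ], $expected, "parse >$input<";
}

done_testing;
```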
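Update: For some reason, I included a leading \G \s* in the regex above. It is entirely unnecessary, although it does no harm AFAICT: every run of non-whitespace is consumed by one of the two alternatives, so the only thing the regex engine ever skips between matches is whitespace. The match regex
    m{ (?| " ($rx_dq_body) " | ($rx_unquoted)) }xmsg
should be exactly equivalent.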
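A quick way to check that claim on particular inputs might look like this. It is only a sketch: the sub names with_anchor, without_anchor and same_tokens are invented for illustration, and agreement on a handful of strings is evidence, not proof.

```perl
use strict;
use warnings;
use 5.010;  # needs (?|...) branch reset

my $rx_dq_body  = qr{ [^\\"]* (?: \\. [^\\"]* )* }xms;
my $rx_unquoted = qr{ \S+ }xms;

# Hypothetical helpers: the same match with and without the \G \s* prefix.
sub with_anchor {
    my ($s) = @_;
    return $s =~ m{ \G \s* (?| " ($rx_dq_body) " | ($rx_unquoted)) }xmsg;
}

sub without_anchor {
    my ($s) = @_;
    return $s =~ m{ (?| " ($rx_dq_body) " | ($rx_unquoted)) }xmsg;
}

# True if both variants return the same items in the same order.
sub same_tokens {
    my ($s) = @_;
    my @a = with_anchor($s);
    my @b = without_anchor($s);
    return @a == @b && !grep { $a[$_] ne $b[$_] } 0 .. $#a;
}

for my $args (
    '', ' ',
    '23 45.67 "John Marcus O\"Ddly" Surname',
    '"only \"quoted\" thing"',
    'no quoted stuff',
) {
    printf ">%s< -> %s\n", $args, same_tokens($args) ? 'same' : 'DIFFERENT';
}
```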
Give a man a fish: <%-{-{-{-<