Re: Parsing, tokens and strings
by merlyn (Sage) on Oct 17, 2001 at 20:11 UTC
|
my @words = /("[^"]*"|'[^']*'|\S+)/g;
-- Randal L. Schwartz, Perl hacker
|
|
As I understand the original poster's goal, this is NOT the
solution to his problem. Although he says he needs to make split
"quote aware", I think he really wants to make it quote UN-aware,
so that quotes are treated as just another delimiter character.
That is what his sample assignment accomplishes: an original line
of the form A "B" yields @array = ( "A", "B" ).
Unless I'm misunderstanding his intentions, the proper
solution would then be something akin to:
my @words = split /[\s"']+/;
(assuming it's not important to ensure balanced use of quotes)
Tim Maher
tim@consultix-inc.com
|
|
Oops! Now I see that the original poster's sample assignment was not as simple as I had shown, because he
had "B C" where I had "B" (multiple quoted words being treated as a single token, vs. my idea of a single word).
This being the case, my suggestion of changing split's delimiters will obviously not work, so "nevermind"! 8-}
Tim Maher
tim@consultix-inc.com
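For reference, here is a minimal sketch of why the split approach falls short. The sample line is borrowed from Fletch's post elsewhere in this thread and is only assumed to be representative of the original poster's input.

```perl
use strict;
use warnings;

# Sample line borrowed from elsewhere in the thread (assumed representative).
my $line = q{Token1 Token2 "Little phrase" Token4};

# Treat whitespace and both quote characters as delimiters.
my @words = split /[\s"']+/, $line;

# The quoted phrase comes back as two separate tokens, "Little" and
# "phrase", instead of staying together as one.
print join( "\n", @words ), "\n";
```

Because the space inside the quotes is itself a delimiter, split has no way to keep "Little phrase" together, which is the limitation described above.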
|
|
Incidentally, for solutions like this, is there an easy way
of adding support for escaped quotes like \"?
-Ted
|
|
Yes, let someone else write and debug the code.
use Text::ParseWords;
my @words = shellwords($_);
{grin}
-- Randal L. Schwartz, Perl hacker
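As a rough illustration of the escaped-quote case asked about above, the sketch below feeds shellwords a line containing \" inside a double-quoted phrase. The input string is hypothetical, made up for this example.

```perl
use strict;
use warnings;
use Text::ParseWords qw( shellwords );

# Hypothetical input: a double-quoted phrase containing escaped quotes.
my $line = q{Token1 "say \"hi\" now" Token3};

# shellwords splits on whitespace, honours quotes, and removes the
# backslashes that protect the inner double quotes.
my @words = shellwords($line);

print join( "\n", @words ), "\n";
```

The middle token should come back as the single string say "hi" now, with the outer quotes and the backslashes removed.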
|
|
Hmm. Not as impressive when you realize I left the quote marks on. {grin} A simple fix:
my @words = grep defined, /"([^"]*)"|'([^']*)'|(\S+)/g;
-- Randal L. Schwartz, Perl hacker
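To spell out why the grep defined is needed, here is a small sketch. Each match of the three-way alternation returns three captures, and only the one belonging to the alternative that matched is defined. The sample line is an assumption for illustration. The second pattern uses the branch reset construct, which is not part of the post above and assumes Perl 5.10 or later.

```perl
use strict;
use warnings;

# Assumed sample input with both kinds of quoting.
my $line = q{Token1 Token2 "Little phrase" 'another one' Token4};

# Each match returns three captures; only the alternative that matched is
# defined, so grep defined discards the two undef slots per match.
my @words = grep defined, $line =~ /"([^"]*)"|'([^']*)'|(\S+)/g;

# Assumes Perl 5.10 or later: branch reset (?|...) numbers every
# alternative's capture as $1, so no undef values are produced.
my @reset = $line =~ /(?|"([^"]*)"|'([^']*)'|(\S+))/g;

print join( "\n", @words ), "\n";
```

Both lists should hold the same five tokens with the quote marks stripped. Neither pattern handles escaped quotes.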
|
|
Heh, I thought the quotes being left in the @array was a "feature", so I:
foreach (@array) { s/"//g; s/'//g; }
Thanks all for your help,
JP Hindin,
-- Alexander Widdlemouse undid his bellybutton and his bum dropped off --
|
|
Yup, that'll be a talented regex hacker...
Re: Parsing, tokens and strings
by Fletch (Bishop) on Oct 17, 2001 at 21:59 UTC
|
use Text::ParseWords qw( shellwords );
$line = q{Token1 Token2 "Little phrase" Token4};
@array = shellwords( $line );
print join( "\n", @array ), "\n";
This has the advantage that Text::ParseWords is a
core module. If you don't mind fetching from CPAN,
cf. Text::Balanced and Regexp::Common.
Update: Of course I just now noticed that merlyn
had mentioned Text::ParseWords above. /me should
read the whole thread more carefully.
Re: Parsing, tokens and strings
by petdance (Parson) on Oct 17, 2001 at 23:05 UTC
|
You can also take a look at Text::CSV_XS, which
lets you set the quote character, the field separator,
and the record terminator to whatever you want.
xoxo,
Andy
--
<megaphone>
Throw down the gun and tiara and come out of the float!
</megaphone>