poprishchin has asked for the wisdom of the Perl Monks concerning the following question:

User input from a form is stored in $keyword

I need to break up $keyword into 2 arrays:
- @quotes, containing any quoted substrings of $keyword
- @keys, containing the remaining words, separated by spaces

For instance, if the input is: This is "the search" string "that was" supplied
- @quotes should contain the search and that was
- @keys should contain This and is and string and supplied

The code I have isn't working:
    my @quotes = $keyword =~ s/"(.*)?"//;
    my @keys = split(/ /, $keyword);
Any suggestions? Also, if a user puts a space at the end of the string, I think things get goofed up more, and I should trim it.

Replies are listed 'Best First'.
Re: search string regex
by bobf (Monsignor) on Oct 29, 2004 at 23:20 UTC

    You're very close. Here is one way to do it:

    use strict;
    use warnings;

    my $string = 'This is "the search" string "that was" supplied';

    my @quotes;
    while( $string =~ s["(.*?)"][] )
    {
        push( @quotes, $1 );
    }
    my @keys = split( ' ', $string );

    use Data::Dumper;
    print "Quotes:\n", Dumper( \@quotes );
    print "Keys:\n", Dumper( \@keys );

    Output:

    Quotes:
    $VAR1 = [
              'the search',
              'that was'
            ];
    Keys:
    $VAR1 = [
              'This',
              'is',
              'string',
              'supplied'
            ];

    I moved the '?' inside the parens so it modifies the '*'. Note also that s/// returns the number of substitutions made (see perlop), not the text that was substituted (or captured in $1, in this case), hence the while loop to walk through the string.
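    The two points above can be checked with a small sketch. It is only an illustration, reusing the sample string from the question; the variable names ($greedy, $lazy, @result) are made up for this example.

```perl
use strict;
use warnings;

my $string = 'This is "the search" string "that was" supplied';

# Greedy: .* runs to the LAST quote, so a single match swallows both phrases.
my ($greedy) = $string =~ /"(.*)"/;

# Non-greedy: .*? stops at the NEXT quote, giving one phrase per match.
my ($lazy) = $string =~ /"(.*?)"/;

# s/// returns the number of substitutions made, even in list context,
# so assigning it to an array does not collect the captured text.
my $copy   = $string;
my @result = $copy =~ s/"(.*?)"//;

print "greedy: $greedy\n";
print "lazy:   $lazy\n";
print "s/// returned: @result\n";
```

    The last line shows why the original one-liner left @quotes holding a count instead of the quoted phrases.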

    I'm sure someone else will come up with a more clever way of doing it, perhaps with one of the Text modules, but this works with the data you supplied. If you need to handle escaped quotes, definitely consider one of the modules.

    HTH

Re: search string regex
by pg (Canon) on Oct 29, 2004 at 23:19 UTC

    It is much cleaner to use Text::ParseWords than a regexp. For example:

    use Text::ParseWords;
    $_ = '1111 "222 222" 333 4444 55 666 77777';
    print join(",", (quotewords('\s+', 0, $_))[0,1,2,4,5,6]);

    Then do whatever you want on top of this.
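    One possible way to build on this for the original question is sketched below. It assumes that passing a true "keep" argument to quotewords leaves the quote characters on each token, which makes it possible to tell quoted phrases from plain words afterwards. split_search is a hypothetical helper name, not part of the module. The input is trimmed first and empty or undefined tokens are skipped, since stray whitespace at the ends can produce them.

```perl
use strict;
use warnings;
use Text::ParseWords qw(quotewords);

# Hypothetical helper: returns two array refs, (\@quotes, \@keys).
sub split_search {
    my ($input) = @_;

    # Trim leading and trailing whitespace before parsing.
    $input =~ s/^\s+|\s+$//g;

    my ( @quotes, @keys );

    # keep = 1: tokens retain their surrounding quote characters.
    for my $token ( quotewords( '\s+', 1, $input ) ) {
        next unless defined $token && length $token;

        if ( $token =~ /^"(.*)"$/s ) {
            push @quotes, $1;    # fully double-quoted token: strip the quotes
        }
        else {
            push @keys, $token;  # anything else counts as a plain keyword
        }
    }
    return ( \@quotes, \@keys );
}

my ( $quotes, $keys ) =
    split_search('This is "the search" string "that was" supplied ');
print "Quotes: ", join( ', ', @$quotes ), "\n";
print "Keys:   ", join( ', ', @$keys ),   "\n";
```

    One caveat worth testing with real form input: Text::ParseWords also treats single quotes as quoting characters, so an unbalanced apostrophe can make quotewords return an empty list.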

Re: search string regex
by bart (Canon) on Oct 30, 2004 at 01:13 UTC
Re: search string regex
by Anonymous Monk on Oct 30, 2004 at 01:41 UTC
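    I think you can keep your original design. Just don't let the match run across quotes, and since s/// returns a count rather than the captured text, collect the matches with m//g before removing them:
    my @quotes = $keyword =~ /"([^"]*)"/g;
    $keyword =~ s/"[^"]*"//g;
    my @keys = split(/ /, $keyword);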
    If you are worried about spaces at the end or two spaces in a row, change the split to
    my @keys = grep length, split(/ /, $keyword);
    or
    my @keys = split(" ", $keyword);
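    The difference between these split forms can be seen in a short sketch. The sample value of $keyword is made up for illustration, with a leading space, a doubled space, and a trailing space.

```perl
use strict;
use warnings;

# Made-up sample: leading, doubled, and trailing spaces.
my $keyword = ' This is  supplied ';

# Pattern / / splits on every single space, so leading and doubled
# spaces produce empty fields (trailing empty fields are dropped).
my @raw = split / /, $keyword;

# Filtering on length discards the empty fields.
my @filtered = grep { length } split / /, $keyword;

# The string ' ' is a special case: skip leading whitespace and
# split on runs of whitespace.
my @awk = split ' ', $keyword;

print scalar(@raw),      " fields from split / /\n";
print scalar(@filtered), " fields after grep length\n";
print scalar(@awk),      " fields from split ' '\n";
```

    Both of the last two forms leave only the actual words, so no separate trim step is needed for @keys.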
Re: search string regex
by TedPride (Priest) on Oct 30, 2004 at 02:17 UTC
    bobf: My solution turned out almost identical to yours. However, you will notice that matches and substitutions are done all at once, rather than one at a time. Benchmarked for a million iterations, I got 19 seconds vs 22 for yours.
    $_ = 'This is "the search" string "that was" supplied';
    my @quotes = (m/"(.*?)"/g);
    s/".*?"//g;
    my @keys = split();
    bart: Benchmarked for the same million iterations, your solution takes 35 seconds. EDIT: Benchmarking methodology is shown below:
    use strict;
    use warnings;

    my $time = time();
    for (1..1000000) {
        $_ = 'This is "the search" string "that was" supplied';
        my @quotes = (m/"(.*?)"/g);
        s/".*?"//g;
        my @keys = split();
    }
    print time() - $time;
    Only the mechanical portions of each algorithm were tested. The results can be easily duplicated with cut and paste.
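    Since time() only has one-second resolution, the core Benchmark module may give a finer-grained comparison. The sketch below is one way to set that up, not the methodology used for the numbers above. The sub names one_at_a_time and all_at_once are made up for this example, and the results will vary by machine and Perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $input = 'This is "the search" string "that was" supplied';

# Loop approach: remove one quoted substring per iteration.
sub one_at_a_time {
    my $string = $input;
    my @quotes;
    while ( $string =~ s/"(.*?)"// ) {
        push @quotes, $1;
    }
    my @keys = split ' ', $string;
    return ( \@quotes, \@keys );
}

# Global approach: one m//g to collect, then one s///g to remove.
sub all_at_once {
    my $string = $input;
    my @quotes = $string =~ /"(.*?)"/g;
    $string =~ s/".*?"//g;
    my @keys = split ' ', $string;
    return ( \@quotes, \@keys );
}

# A negative count means: run each sub for at least that many CPU seconds.
cmpthese( -1, {
    one_at_a_time => \&one_at_a_time,
    all_at_once   => \&all_at_once,
} );
```

    cmpthese prints a table of rates and relative percentages, which is easier to compare than raw elapsed seconds.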