htmanning has asked for the wisdom of the Perl Monks concerning the following question:

I'm using perl to search a database. It works fine with "and" and "or" searches. I'm trying to get it to work like Google in that if I put quotes around the string, the search has to return that exact string. I'm currently using the following:
$searchtextterms = $in{'searchtextterms'}; $origterms = $searchtextterms; $escterms = $searchtextterms; $escterms =~ s/ /%20/g; $searchtextterms =~ s/\'/\\'/g; $andor = $in{'andor'}; $andor = "AND" if ($andor eq ""); # break searchtextterms on white space to do AND or OR strings @words = split (/ /, $searchtextterms); $num_words = @words; $terms_string = "(title Like '%$words[0]%' OR description Like '%$word +s[0]%' OR organization Like '%$words[0]%' OR type Like '%$words[0]%') +"; for ($n=1; $n<=$num_words-1; $n++) { $terms_string = $terms_string . " $andor (title Like '%$wo +rds[0]%' OR description Like '%$words[0]%' OR organization Like '%$wo +rds[0]%' OR type Like '%$words%')" if ($words[$n] ne ""); } $SQL = "SELECT num FROM database WHERE ($terms_string)";
I just need to know how to do an if $searchtextterms are between quotes, do a query on the whole phrase. Thanks.

Replies are listed 'Best First'.
Re: Help with search query
by Anonymous Monk on Mar 27, 2012 at 06:30 UTC
Re: Help with search query
by kcott (Archbishop) on Mar 27, 2012 at 08:10 UTC

    I thought Text::Balanced might be an appropriate module to use here but I haven't used it before. The following test indicates suitablility for your application; other monks (who might have used this module more than once) may have better ways of doing this.

    #!/usr/bin/env perl use 5.010; use strict; use warnings; use Text::Balanced qw{extract_multiple gen_delimited_pat}; my $in_string = q{qwe asd 'rty uio p' zxc "vbnm fghj" 123 4 ' @ & '}; my @tokens = extract_multiple($in_string => [ q{ }, gen_delimited_pat(q{'"}) ]); for my $token (@tokens) { next if $token eq q{ }; $token =~ y{'"}{}d; say $token; }

    Output:

    qwe asd rty uio p zxc vbnm fghj 123 4 @ &

    -- Ken

Re: Help with search query
by JavaFan (Canon) on Mar 27, 2012 at 09:28 UTC
    Perhaps you want something like (untested):
    my @parts = $in{'searchtextterms'} =~ /(?|"([^"]*)"|'([^']*)'|(\S+))/g +;
    This doesn't allow for escaping the quotes. It also may not do what you want it to do on input like:
    foo'bar
    This will break your SQL queries as presented, as you're interpolating and not using place holders. In fact, the interpolating may allow for SQL injection attacks, but your input may be trusted, or you may do your scrubbing elsewhere; I cannot judge that from your post.

      Search::QueryParser supports quoted constructs :) But if you want your regex see Regexp::Grammars::Common::String or Regexp::Common

      use YAPE::Regex::Explain; print YAPE::Regex::Explain->new( qr/(?:(?:\")(?:[^\\\"]*(?:\\.[^\\\"]*)*)(?:\")|(?:\')(?:[^\\\']*(?:\\. +[^\\\']*)*)(?:\')|(?:\`)(?:[^\\\`]*(?:\\.[^\\\`]*)*)(?:\`))/ )->explain; __END__ The regular expression: (?-imsx:(?:(?:\")(?:[^\\\"]*(?:\\.[^\\\"]*)*)(?:\")|(?:\')(?:[^\\\']*( +?:\\.[^\\\']*)*)(?:\')|(?:\`)(?:[^\\\`]*(?:\\.[^\\\`]*)*)(?:\`))) matches as follows: NODE EXPLANATION ---------------------------------------------------------------------- (?-imsx: group, but do not capture (case-sensitive) (with ^ and $ matching normally) (with . not matching \n) (matching whitespace and # normally): ---------------------------------------------------------------------- (?: group, but do not capture: ---------------------------------------------------------------------- (?: group, but do not capture: ---------------------------------------------------------------------- \" '"' ---------------------------------------------------------------------- ) end of grouping ---------------------------------------------------------------------- (?: group, but do not capture: ---------------------------------------------------------------------- [^\\\"]* any character except: '\\', '\"' (0 or more times (matching the most amount possible)) ---------------------------------------------------------------------- (?: group, but do not capture (0 or more times (matching the most amount possible)): ---------------------------------------------------------------------- \\ '\' ---------------------------------------------------------------------- . any character except \n ---------------------------------------------------------------------- [^\\\"]* any character except: '\\', '\"' (0 or more times (matching the most amount possible)) ---------------------------------------------------------------------- )* end of grouping ---------------------------------------------------------------------- ) end of grouping ---------------------------------------------------------------------- (?: group, but do not capture: ---------------------------------------------------------------------- \" '"' ---------------------------------------------------------------------- ) end of grouping ---------------------------------------------------------------------- | OR ---------------------------------------------------------------------- (?: group, but do not capture: ---------------------------------------------------------------------- \' ''' ---------------------------------------------------------------------- ) end of grouping ---------------------------------------------------------------------- (?: group, but do not capture: ---------------------------------------------------------------------- [^\\\']* any character except: '\\', '\'' (0 or more times (matching the most amount possible)) ---------------------------------------------------------------------- (?: group, but do not capture (0 or more times (matching the most amount possible)): ---------------------------------------------------------------------- \\ '\' ---------------------------------------------------------------------- . any character except \n ---------------------------------------------------------------------- [^\\\']* any character except: '\\', '\'' (0 or more times (matching the most amount possible)) ---------------------------------------------------------------------- )* end of grouping ---------------------------------------------------------------------- ) end of grouping ---------------------------------------------------------------------- (?: group, but do not capture: ---------------------------------------------------------------------- \' ''' ---------------------------------------------------------------------- ) end of grouping ---------------------------------------------------------------------- | OR ---------------------------------------------------------------------- (?: group, but do not capture: ---------------------------------------------------------------------- \` '`' ---------------------------------------------------------------------- ) end of grouping ---------------------------------------------------------------------- (?: group, but do not capture: ---------------------------------------------------------------------- [^\\\`]* any character except: '\\', '\`' (0 or more times (matching the most amount possible)) ---------------------------------------------------------------------- (?: group, but do not capture (0 or more times (matching the most amount possible)): ---------------------------------------------------------------------- \\ '\' ---------------------------------------------------------------------- . any character except \n ---------------------------------------------------------------------- [^\\\`]* any character except: '\\', '\`' (0 or more times (matching the most amount possible)) ---------------------------------------------------------------------- )* end of grouping ---------------------------------------------------------------------- ) end of grouping ---------------------------------------------------------------------- (?: group, but do not capture: ---------------------------------------------------------------------- \` '`' ---------------------------------------------------------------------- ) end of grouping ---------------------------------------------------------------------- ) end of grouping ---------------------------------------------------------------------- ) end of grouping ----------------------------------------------------------------------