pffan239 has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to come up with a search interface that works like Google's, where query words (see: advanced operators reference) let you target specific sub-fields in the data.

For example, if I were looking for a customer in NY named Jones, I could enter something like: "name: jones location: ny"

I'm interested in this approach because my application has a large number of fields that could potentially be queried. Most of the time only a few fields are used, but I'd like to have the power to get to any of them on tap if needed.

I've already got the data in a normalized MySQL database. I'm confident that I can turn the query words into the SQL I need to get the information out of the database, but I don't want to reinvent a parser if someone has already written one. I know from experience that parsing is always a bunch harder than it looks on the surface.

I've scanned CPAN for likely candidates, but can't find anything that seems close.
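For context, the "query words to SQL" step mentioned above might look roughly like the following. This is only a minimal sketch: the `%column_for` whitelist, the table and column names, and the `build_where` helper are all hypothetical and are not taken from the actual schema. It assumes the parser hands back a plain hash of field => value pairs, and it emits `?` placeholders so the values can be passed separately to DBI rather than interpolated into the SQL.

```perl
use strict;
use warnings;

# Hypothetical whitelist: query word => column (assumed names, not the real schema).
my %column_for = (
    name     => 'customers.name',
    location => 'customers.state',
);

# Build a WHERE clause with placeholders from a hash of parsed query words.
# Returns the clause text followed by the bind values, in matching order.
sub build_where {
    my (%query) = @_;
    my (@clauses, @binds);
    for my $field (sort keys %query) {
        # Reject anything that is not a known query word.
        my $column = $column_for{ lc $field }
            or die "Unknown query word: $field\n";
        push @clauses, "$column = ?";
        push @binds,   $query{$field};
    }
    return ('1 = 1') unless @clauses;    # no fields given: match everything
    return (join(' AND ', @clauses), @binds);
}

my ($where, @binds) = build_where(name => 'jones', location => 'ny');
print "WHERE $where\n";     # WHERE customers.state = ? AND customers.name = ?
print "binds: @binds\n";    # binds: ny jones
```

Mapping through a whitelist means user input never becomes a column name directly, which matters more as the number of queryable fields grows.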

Can anyone point me in the right direction?

Re: Module suggestions for parsing query words (ala google)?
by moritz (Cardinal) on Sep 30, 2010 at 21:07 UTC
    If you want all sorts of bells and whistles, including AND and OR and data fields, try KinoSearch::QueryParser.
    Perl 6 - links to (nearly) everything that is Perl 6.
Re: Module suggestions for parsing query words (ala google)?
by BrowserUk (Patriarch) on Sep 30, 2010 at 20:34 UTC

    Ostensibly, your "parser" could be as simple as:

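    use Data::Dump qw(pp);

    my $input = "name: jones location: ny";
    # Grab each "field: value" chunk, then split it into a key and a value.
    my %query = map { split /:\s+/ } $input =~ m[(\S+:\s+\S+)]g;
    pp \%query;    # { location => "ny", name => "jones" }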

    Or just:

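    use Data::Dump qw(pp);

    my $input = "name: jones location: ny";
    # Two captures per match: the list of (field, value) pairs fills the hash directly.
    my %query = $input =~ m[(\S+):\s+(\S+)]g;
    pp \%query;    # { location => "ny", name => "jones" }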

    Do you need more sophistication than that?


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      Thanks for your reply!

      That's definitely a good start. At first, I didn't think that regexes would be good enough.

      I didn't have it in my original post, but I had in mind that the query terms could be multiple words, i.e.:

      name: mike jones location: ny

      or perhaps, more explicitly:

      name: "mike jones" location: ny

      Regardless, my regex-fu should be good enough to get that figured out.

      I'll give this approach a shot and see how far I get.
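      For comparison, the quoted and unquoted multi-word cases above could also be handled without a hand-written regex. The following is a rough sketch, not a tested production parser: it leans on the core `Text::ParseWords` module to honour the quotes, and the `parse_query` helper is a hypothetical name. It assumes a field marker is a word followed by a colon and then whitespace (so `name:jones` with no space would not be recognised), and that quotes are balanced (`shellwords` returns an empty list otherwise).

```perl
use strict;
use warnings;
use Text::ParseWords qw(shellwords);

# Split on whitespace (honouring quotes), then treat any token of the form
# "word:" as the start of a new field; other tokens extend the current field.
sub parse_query {
    my ($input) = @_;
    my (%query, $field);
    for my $token (shellwords($input)) {
        if ($token =~ /^(\w+):$/) {
            $field = lc $1;
            $query{$field} = '';
        }
        elsif (defined $field) {
            # Join multi-word values with single spaces.
            $query{$field} .= length($query{$field}) ? " $token" : $token;
        }
        # Tokens before the first "field:" are ignored in this sketch.
    }
    return %query;
}

my %q = parse_query('name: "mike jones" location: ny');
print "$_ => $q{$_}\n" for sort keys %q;
```

      Both `name: mike jones location: ny` and `name: "mike jones" location: ny` should come out with name set to "mike jones" and location set to "ny".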

        Something like:

        use Data::Dump qw(pp);

        my @inputs = (
            "name: mike jones location: ny",
            'name: "mike jones" location: ny',
            "name: jones location: ny",
        );
        # Field name, optional quote, value (non-greedy), optional quote,
        # stopping just before the next "field:" or at the end of the input.
        my $re = qr[ (?: (\S+): \s* ) ["']? (.+?) ["']? (?=\s+\S+:|$) ]x;
        my %h;
        %h = m[$re]g and pp $_, \%h for @inputs;

        ("name: mike jones location: ny", { location => "ny", name => "mike jones" })
        ("name: \"mike jones\" location: ny", { location => "ny", name => "mike jones" })
        ("name: jones location: ny", { location => "ny", name => "jones" })

Re: Module suggestions for parsing query words (ala google)?
by LTjake (Prior) on Oct 01, 2010 at 15:14 UTC

      Thanks for the reply!

      This is what I was looking for originally!

      I'll probably use the hand-coded regex from BrowserUk above for my proof-of-concept prototype and use this module for the eventual real implementation.