Massyn has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

Regular expressions are simply the most wonderful thing on the planet. The problem is that the regular user out on the tubes doesn't understand what \w or \b means, let alone .+. What most of them do understand is how to pop stuff into Google and get some sort of a response.

Google gives them the + and - operators, plus AND, OR, and so on: basic boolean logic that is very user friendly and quite effective.

So how do I convert those arbitrary search parameters into regular expressions? My plan is to start by splitting on AND and OR, then massage the pieces into a regular expression with pipes and brackets. I may or may not end up with something that works close to how Google parses its input.
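
To make the idea concrete, here is a rough sketch of that split-and-massage approach. It is written in Python purely for illustration (the same lookahead constructs exist in Perl), and the grammar is an assumption of mine, not Google's actual rules: words are whitespace-separated, OR joins the words on either side of it, AND (or plain adjacency) requires both sides, a leading - excludes a word, and a leading + is accepted and ignored. The function name query_to_regex is hypothetical.

```python
import re

def query_to_regex(query):
    """Translate a tiny boolean query into one compiled regex.

    Assumed grammar (illustrative only): OR binds tighter than AND,
    '-word' excludes a word, '+word' is treated like 'word'.
    """
    groups = []        # each inner list holds OR-alternatives; groups are AND-ed
    excluded = []      # words that must not appear
    pending_or = False
    for token in query.split():
        if token == "AND":
            continue
        if token == "OR":
            pending_or = True
            continue
        if token.startswith("-") and len(token) > 1:
            excluded.append(re.escape(token[1:]))
            pending_or = False
            continue
        word = re.escape(token.lstrip("+"))
        if not word:
            continue
        if pending_or and groups:
            groups[-1].append(word)   # extend the previous alternative set
        else:
            groups.append([word])     # start a new required group
        pending_or = False

    # One positive lookahead per required group, one negative per exclusion.
    parts = [r"(?=.*\b(?:" + "|".join(g) + r")\b)" for g in groups]
    parts += [r"(?!.*\b" + w + r"\b)" for w in excluded]
    return re.compile("^" + "".join(parts), re.IGNORECASE | re.DOTALL)

rx = query_to_regex("perl AND regex OR regexp -python")
print(bool(rx.search("Perl has a great regexp engine")))
print(bool(rx.search("Perl and Python both have a regex engine")))
```

The lookaheads are what let AND work regardless of word order; pipes alone only give you OR. Note that the \b anchors assume each term starts and ends with a word character, so a term like c++ would need different handling.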

Any thoughts from your side? CPAN did not reveal any obvious module that could do this for me. I'll see how it goes, and share the code for public scrutiny when I'm done.

Thank you, kind monks, have a blessed day of Perl Meditation.

Hail Larry...

Massyn

Replies are listed 'Best First'.
Re: Convert "google" to regex?
by moritz (Cardinal) on Feb 22, 2010 at 21:04 UTC
    If you want to offer your customers a search engine, use a search engine.

    In particular I'm fond of KinoSearch, especially since the current series of developer releases adds very useful features. If you want an alternative, look at Plucene.

    They build an index and do things like stemming (so that if you search for Mooses, you will also find documents containing only Moose).

    If you really want to turn a search string into a regex, you could at least use the query parser from such a search engine to do the analysis of the search string for you.

Re: Convert "google" to regex?
by ikegami (Patriarch) on Feb 22, 2010 at 21:05 UTC