
This is not as expensive as you probably think. Assume
Compaq Presario Laptop Model No P440 L100 Series (PSLA0L-00U00E)
as input, which is broken into these words:
00 440 100 Compaq Laptop Model No Presario Series PSLA
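One way to get from the input line to that word list, as a rough sketch only: the splitting rules below are my guess from the example (runs of letters or runs of digits, single characters dropped, duplicates removed), and tokenize is a made-up name. It gives the same ten words, though not in the same order.

```perl
use strict;
use warnings;

# Hypothetical helper: split a product description into searchable words.
# Assumed rules (inferred from the example, not confirmed by the post):
#   - a word is a run of letters or a run of digits
#   - single-character words are dropped
#   - duplicates are removed, keeping the first occurrence
sub tokenize {
    my ($text) = @_;
    my %seen;
    return grep { length($_) > 1 && !$seen{$_}++ }
           $text =~ /([A-Za-z]+|[0-9]+)/g;
}

my @tokens = tokenize('Compaq Presario Laptop Model No P440 L100 Series (PSLA0L-00U00E)');
print join(' ', @tokens), "\n";
```

Dropping one-character words gets rid of the stray P, L, U and E left over from splitting the model numbers, which would otherwise match far too many products.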
Then we need to do something like this:
    ...
    # get vendor id:
    SELECT id FROM vendor WHERE NAME=00 or NAME=440 ...

    # check all words
    foreach (@word) {
        SELECT product_id FROM words WHERE vendor_id=? and word=?
        for my $id (@ids) { $cnt{$id}++ }
    }
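To make the counting step concrete, here is a small runnable sketch that uses plain hashes in place of the two tables, so it needs no database. The sample data, the shape of the hashes and the name match_counts are all invented for illustration; with a real database each lookup would be one of the SELECT statements above.

```perl
use strict;
use warnings;

# Made-up sample data standing in for the two tables.
# %vendor plays the role of "vendor" (name => id); %words plays the role of
# "words", indexed as vendor_id => word => [product_id, ...].
my %vendor = (Compaq => 1, Toshiba => 2);
my %words  = (
    1 => {
        Presario => [10, 11],
        Laptop   => [10, 11],
        440      => [10],
        100      => [10],
        PSLA     => [10],
        700      => [11],
    },
    2 => { Laptop => [20] },
);

# Count, per product, how many of the input words match for the vendor.
sub match_counts {
    my ($vendor_map, $word_index, @tokens) = @_;

    # Stand-in for the OR query: take the first word that is a known vendor name.
    my ($vendor_id) = map  { $vendor_map->{$_} }
                      grep { exists $vendor_map->{$_} } @tokens;
    return () unless defined $vendor_id;

    my %cnt;
    for my $word (@tokens) {
        # Stand-in for: SELECT product_id FROM words WHERE vendor_id=? AND word=?
        my $ids = $word_index->{$vendor_id}{$word} or next;
        $cnt{$_}++ for @$ids;
    }
    return %cnt;
}

my @tokens = qw(00 440 100 Compaq Laptop Model No Presario Series PSLA);
my %cnt    = match_counts(\%vendor, \%words, @tokens);

# Print candidates, best match first.
for my $id (sort { $cnt{$b} <=> $cnt{$a} || $a <=> $b } keys %cnt) {
    print "product $id: $cnt{$id} matching words\n";
}
```

With this sample data, product 10 matches five words and product 11 matches two, so the cost is one cheap indexed lookup per word rather than a comparison against every product.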
How you then analyze these match statistics depends on the data. But you are right, edit distance is probably the wrong metric.

Re^5: Help required in RE strategy
by suaveant (Parson) on Aug 22, 2007 at 18:02 UTC
    Don't you want that first query to be an AND, assuming you can be relatively sure that all terms will match?

                    - Ant
                    - Some of my best work - (1 2 3)