in reply to Re^2: phrase match
in thread phrase match

All very good points, AnomalousMonk!

>I'm not sure I understand why the poor sentence must be mauled so relentlessly in your final approach, but it's fine with me if it works for you.

I was just experimenting with a few more things.

>I note that the approach you use does not seem to take account of longest versus shortest matches: 'tor SET6' can never match because 'tor' precedes it in the ordered alternation. Perhaps this is your intent, but be aware that as the code stands, longest-shortest matching behavior depends on the order in which phrases appear in the phrase list. (This is touched on in paragraph 5 of Re^3: phrase match.) See example below.

Very good point. Yes, my 'phrase list' will be kept in decreasing order of phrase string length.
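A minimal sketch of that ordering, assuming a made-up phrase list and sentence (only 'tor' and 'tor SET6' come from this thread; @phrases, $sentence and the \b anchors are illustrative assumptions, not my actual code):

```perl
use strict;
use warnings;

# Hypothetical phrase list; only 'tor' and 'tor SET6' appear in the thread.
my @phrases  = ('tor', 'abc', 'tor SET6');
my $sentence = 'xyz tor SET6 abc';

# Longest phrases first, so a phrase is tried before any shorter prefix of it.
my @by_length = sort { length($b) <=> length($a) } @phrases;

# quotemeta protects any regex metacharacters inside a phrase.
my $alternation = join '|', map { quotemeta($_) } @by_length;
my $rx = qr/\b(?:$alternation)\b/;

# Collect every phrase found in the sentence.
my @found = $sentence =~ /($rx)/g;
print join(', ', map { "'$_'" } @found), "\n";   # 'tor SET6', 'abc'
```

With the list left in its original order, 'tor' would win at that position and 'tor SET6' could never match.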

>I also note there is still no provision for a 'sentence' ending in a period, although again, perhaps this contingency will never arise. Example also below.

Yes, I will have the period removed in a preprocessing step.
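Something along these lines is what I have in mind for the preprocessing, assuming only a single trailing period needs to go (the sample sentence and variable names are made up):

```perl
use strict;
use warnings;

# Hypothetical input; the real sentences are not shown in this thread.
my $sentence = 'xyz tor SET6 abc.';

# Copy the sentence, then strip one trailing period and any whitespace after it.
(my $clean = $sentence) =~ s/\.\s*\z//;

print "$clean\n";   # xyz tor SET6 abc
```

Periods inside the sentence are left alone; only the one at the very end is removed.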

Thanks a lot again.

Re^4: phrase match
by AnomalousMonk (Archbishop) on Dec 14, 2009 at 16:13 UTC
    I was just experimenting with a few more things.

    Experimentation is good!

    ... my 'phrase list' would be in the decreasing order of phrase string length.

    If you do a reverse sort on the phrase list or array, the job is done: any phrase that begins with a shorter phrase sorts ahead of it, and you no longer have to worry about order of insertion when adding new phrases. See example in Re^3: phrase match, para 5.
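    A small sketch of the reverse sort, using a made-up phrase list (only 'tor' and 'tor SET6' are from this thread; the other phrases are assumptions for illustration):

```perl
use strict;
use warnings;

# Hypothetical phrase list, deliberately in an unhelpful order.
my @phrases = ('tor', 'abc', 'tor SET6', 'ab');

# Plain string sort, reversed: descending lexical order.
my @ordered = reverse sort @phrases;

print join(', ', map { "'$_'" } @ordered), "\n";
# 'tor SET6', 'tor', 'abc', 'ab'
```

    Note that this is a descending string order rather than a strict ordering by length. It is enough here because the only phrases that can compete at the same starting position are those where one is a prefix of the other, and the longer of such a pair always sorts first.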