in reply to Finding dictionary words in a string.
In particular, there needs to be some sort of trade-off between the two scoring criteria: "strings that are composed entirely of words get a better score" is somewhat at odds with "strings with the fewest number of words will get a higher score". Are you talking about solutions that involve just non-overlapping words? If so, then you would need to compare various possible parses of the input string. How would you score a string like "labsolve"? Does "lab, solve" score better or worse than "absolve, one letter left unused"?
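To make the "labsolve" question concrete, here is a rough sketch of comparing parses, written in Python for brevity. Everything in it is an assumption for illustration: the names `parses`, `score` and `best_parse`, the three-word dictionary, and especially the two penalty weights, which are made-up knobs and not the scoring from the original question. It enumerates every split of the string into non-overlapping dictionary words and unused single letters, then picks the cheapest one.

```python
def parses(s, words):
    """Yield every split of s into dictionary words and unused single letters.

    Each parse is a list of (piece, is_word) pairs covering s left to right.
    """
    if not s:
        yield []
        return
    # Option 1: leave the first letter unused.
    for rest in parses(s[1:], words):
        yield [(s[0], False)] + rest
    # Option 2: start with any dictionary word that is a prefix of s.
    for end in range(1, len(s) + 1):
        if s[:end] in words:
            for rest in parses(s[end:], words):
                yield [(s[:end], True)] + rest


def score(parse, unused_penalty, word_penalty):
    """Lower is better. Both penalties are hypothetical weights."""
    unused = sum(1 for _, is_word in parse if not is_word)
    n_words = sum(1 for _, is_word in parse if is_word)
    return unused * unused_penalty + n_words * word_penalty


def best_parse(s, words, unused_penalty, word_penalty):
    """Return the lowest-scoring parse of s."""
    return min(parses(s, words),
               key=lambda p: score(p, unused_penalty, word_penalty))


words = {"lab", "solve", "absolve"}

# Unused letters cost more than extra words: "lab, solve" wins.
print(best_parse("labsolve", words, unused_penalty=3, word_penalty=1))

# Extra words cost more than unused letters: "absolve" plus a stray "l" wins.
print(best_parse("labsolve", words, unused_penalty=1, word_penalty=3))
```

The answer flips depending on the weights, which is exactly the trade-off that has to be settled first. Note that this brute-force enumeration is exponential in the length of the string, so it is only meant for short inputs.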
In either case, it might be helpful to store your dictionary as an array of hashes, where the index into the array is the length of the words in that hash. (The hashes themselves could be sub-indexed by first letter, perhaps, or by the first two or three letters, as suggested by the AM in the first reply.) This way, you know before you start on the input string the longest substring you need to match against, and for each substring you test, you only need to compare it against a limited number of dictionary entries.
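A minimal sketch of the length-indexed dictionary idea, in Python for brevity. The names `build_index` and `find_words` and the sample word list are assumptions for illustration. A dict keyed by word length stands in for the array of hashes, and each bucket is a plain set, so the optional sub-indexing by first letter is left out.

```python
from collections import defaultdict


def build_index(words):
    """Group words by length: index[n] is the set of all n-letter words."""
    index = defaultdict(set)
    for w in words:
        index[len(w)].add(w)
    return dict(index)


def find_words(s, index):
    """Return (start, word) for every dictionary word occurring in s."""
    if not index:
        return []
    longest = max(index)  # known before we look at the input string
    hits = []
    for start in range(len(s)):
        # Never try substrings longer than the longest dictionary word.
        for length in range(1, min(longest, len(s) - start) + 1):
            candidate = s[start:start + length]
            # Only the bucket for this exact length is consulted.
            if candidate in index.get(length, ()):
                hits.append((start, candidate))
    return hits


index = build_index(["lab", "solve", "absolve", "a"])
print(find_words("labsolve", index))
```

The hits may overlap (here "a" and "absolve" both start at the same position), so choosing among them is still a separate scoring step.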
Re: Re: Finding dictionary words in a string.
by ehdonhon (Curate) on Mar 13, 2004 at 18:53 UTC
by tachyon (Chancellor) on Mar 13, 2004 at 19:57 UTC