The earlier comment regarding spell/grammar checkers was spot-on. If you can find any information on how they function, it would probably be highly relevant to your problem.
For a more general solution, a database seems to me like your best bet, whether a 'real' database (Postgres, MySQL, etc.) or just a tied/dbm hash.
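To make the dbm-hash idea concrete, here is a minimal sketch in Python rather than Perl (Python's standard `dbm` module plays roughly the role of a tied dbm hash). The function names `build_dbm` and `dbm_lookup`, and the idea of storing an "info" string per word, are my own assumptions for illustration, not anything from the original question.

```python
import dbm
import os
import tempfile


def build_dbm(words_with_info, db_path):
    """Store each (word, info) pair in an on-disk dbm hash."""
    # "n" always creates a new, empty database at db_path.
    with dbm.open(db_path, "n") as db:
        for word, info in words_with_info:
            # dbm stores keys and values as bytes.
            db[word.encode("utf-8")] = info.encode("utf-8")


def dbm_lookup(db_path, word):
    """Return the stored info for word, or None if the word is absent."""
    with dbm.open(db_path, "r") as db:
        value = db.get(word.encode("utf-8"))
        return value.decode("utf-8") if value is not None else None


if __name__ == "__main__":
    # Hypothetical sample data, only to show the calls.
    path = os.path.join(tempfile.mkdtemp(), "dictionary")
    build_dbm([("walk", "verb"), ("walked", "verb;past")], path)
    print(dbm_lookup(path, "walked"))
    print(dbm_lookup(path, "run"))
```

The point is that each lookup goes straight to the key on disk instead of scanning the whole word list, and the database only needs to be rebuilt when the dictionary changes.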
If you really need to work directly from a plain text file for some reason, you could index it to get at least some of the improvement that a database would bring. Sort the text file (it's probably already sorted, being a dictionary, but I mention it just to be sure), then build a separate index file containing the offset in the dictionary of the first word beginning with each letter. To look up a word, seek to the offset for its first letter, read and process lines from there, and stop when you hit a line that starts with a different letter. That way you never search through words that start with the wrong letter, which effectively reduces your dictionary size substantially.
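A rough sketch of that first-letter index, again in Python for illustration. It assumes one word per line, UTF-8 text, and a file sorted so that all words sharing a first character are contiguous; the names `build_letter_index`, `save_index`, `load_index` and `indexed_lookup`, and the choice of JSON for the index file, are hypothetical.

```python
import json


def build_letter_index(dict_path):
    """Map each first character to the byte offset of its first line."""
    index = {}
    # Binary mode keeps tell()/seek() working in plain byte offsets.
    with open(dict_path, "rb") as f:
        while True:
            offset = f.tell()
            raw = f.readline()
            if not raw:
                break  # end of file
            word = raw.decode("utf-8").strip()
            if word:
                # Only the first line seen for each character is recorded.
                index.setdefault(word[0], offset)
    return index


def save_index(index, index_path):
    """Write the index to a separate file (JSON here, as an assumption)."""
    with open(index_path, "w", encoding="utf-8") as f:
        json.dump(index, f)


def load_index(index_path):
    """Read back an index written by save_index."""
    with open(index_path, encoding="utf-8") as f:
        return json.load(f)


def indexed_lookup(dict_path, index, word):
    """Return True if word is in the dictionary, scanning only its letter block."""
    if not word or word[0] not in index:
        return False
    first = word[0]
    with open(dict_path, "rb") as f:
        f.seek(index[first])
        for raw in f:
            entry = raw.decode("utf-8").strip()
            if not entry:
                continue  # skip blank lines
            if entry[0] != first:
                break  # left the block for this letter
            if entry == word:
                return True
    return False
```

The index must be rebuilt whenever the dictionary file changes, since every offset after an edit shifts. If the per-letter blocks are still too large, the same approach extends to the first two letters.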