in reply to fill diacritic into text

You want to use something like DB_File (or BerkeleyDB, or DBI with DBD::SQLite) to turn the lookup file into a hash on disk, so the whole list never has to sit in memory.
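As a minimal sketch of the hash-on-disk idea, the following ties a Perl hash to a DB_File database. The file names, the tab-separated "plain, then accented" line format, and the helper names `build_lookup_db` and `restore_word` are assumptions made for illustration; the original lookup file may be laid out differently. Values are stored as raw bytes, so UTF-8 handling is left to the caller.

```perl
use strict;
use warnings;
use DB_File;
use Fcntl qw(O_RDWR O_CREAT O_RDONLY);

# Build the on-disk hash once from a tab-separated lookup file.
# Assumed (hypothetical) line format: "plain<TAB>accented"
sub build_lookup_db {
    my ($txt_path, $db_path) = @_;

    tie my %db, 'DB_File', $db_path, O_RDWR | O_CREAT, 0644, $DB_HASH
        or die "Cannot tie $db_path: $!";

    open my $in, '<', $txt_path or die "Cannot open $txt_path: $!";
    while (my $line = <$in>) {
        chomp $line;
        my ($plain, $accented) = split /\t/, $line, 2;
        next unless defined $accented;    # skip malformed lines
        $db{$plain} = $accented;          # stored on disk as raw bytes
    }
    close $in;
    untie %db;
    return;
}

# Look up one word in a tied hash; fall back to the word itself
# when the lookup file has no entry for it.
sub restore_word {
    my ($db, $word) = @_;
    return exists $db->{$word} ? $db->{$word} : $word;
}

# Typical use (file names are placeholders):
#   build_lookup_db('lookup.txt', 'lookup.db');
#   tie my %lookup, 'DB_File', 'lookup.db', O_RDONLY, 0644, $DB_HASH
#       or die "Cannot tie lookup.db: $!";
#   print restore_word(\%lookup, 'cafe'), "\n";
```

Each lookup reads only the record it needs from disk, so memory use stays roughly constant no matter how large the lookup file is.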

Re^2: fill diacritic into text
by Grundle (Scribe) on May 30, 2007 at 17:34 UTC
    This is also what I would suggest. You need some sort of database conversion so that you do not have to load everything into memory. Then all you need to do is write the logic that looks up each "input" word with a query, for example:

    SELECT fl.replace_word, fl.frequency FROM frequency_list fl INNER JOIN synonyms syn ON syn.wordid = fl.wordid WHERE syn.word = ? ORDER BY fl.frequency DESC LIMIT 1

    This of course means you would have to build your database tables so that you could make use of this data. The query I just wrote is an example of what you COULD do if you had the data loaded into a database.
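    To make the query idea concrete, here is a rough sketch using DBI with DBD::SQLite. The schema is hypothetical: it is modeled on the table and column names in the query above, and the helper names `open_db` and `best_replacement` are invented for illustration. It picks the highest-frequency replacement with ORDER BY and LIMIT, and passes the word as a bind parameter rather than interpolating it into the SQL.

```perl
use strict;
use warnings;
use DBI;

# Open (or create) the database and make sure the assumed tables exist.
# Hypothetical schema: synonyms maps an input word to a wordid, and
# frequency_list holds candidate replacements with their frequencies.
sub open_db {
    my ($path) = @_;
    my $dbh = DBI->connect("dbi:SQLite:dbname=$path", '', '',
        { RaiseError => 1, AutoCommit => 1 });
    $dbh->do('CREATE TABLE IF NOT EXISTS synonyms
                  (wordid INTEGER, word TEXT)');
    $dbh->do('CREATE TABLE IF NOT EXISTS frequency_list
                  (wordid INTEGER, replace_word TEXT, frequency INTEGER)');
    $dbh->do('CREATE INDEX IF NOT EXISTS syn_word ON synonyms (word)');
    return $dbh;
}

# Return the most frequent replacement for $word,
# or undef if nothing is stored for it.
sub best_replacement {
    my ($dbh, $word) = @_;
    my ($replacement) = $dbh->selectrow_array(
        'SELECT fl.replace_word
           FROM frequency_list fl
           INNER JOIN synonyms syn ON syn.wordid = fl.wordid
          WHERE syn.word = ?
          ORDER BY fl.frequency DESC
          LIMIT 1',
        undef, $word);
    return $replacement;
}
```

    The index on synonyms.word is what keeps each lookup fast once the tables grow; without it, every query scans the whole table.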

    One thing I do not understand about your exercise is how you decide which word is synonymous with another. How do you relate a word plus its frequency count to a word in your text?

    There has to be a relationship established between the words, otherwise you have a meaningless list...