in reply to reading/writing to a file

You're also reading every line of the dictionary file for every word, which is horribly inefficient. Since most of the run time will go to file I/O, it would be much better to read a large chunk of the dictionary file at once (if not all of it) and check every pending word against that chunk in one pass, moving each word from the original array to a second array as soon as a match is found. That way you only have to read the dictionary file once.

And there's no need to read and write at the same time. You can read first, then reopen the file for append and add all the new words in one print.
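To make the read-once, append-once idea concrete, here is a rough sketch. It is written in Python only for illustration, and it assumes one word per line and a dictionary file that ends with a newline. The names find_new_words, append_words and batch_lines are made up for this example and are not taken from the original script.

```python
from itertools import islice


def find_new_words(dict_path, candidates, batch_lines=100_000):
    """Return the candidates that never show up in the dictionary file.

    The file is read exactly once, a batch of lines at a time, so the
    whole dictionary never has to sit in memory.
    """
    pending = set(candidates)
    with open(dict_path, encoding="utf-8") as fh:
        while pending:
            batch = list(islice(fh, batch_lines))  # next chunk of lines
            if not batch:
                break  # end of file
            # Drop every pending word that this chunk matched.
            pending.difference_update(line.rstrip("\n") for line in batch)
    return pending


def append_words(dict_path, words):
    """Reopen the file for append and add all new words in a single write."""
    if not words:
        return
    with open(dict_path, "a", encoding="utf-8") as fh:
        fh.write("".join(f"{word}\n" for word in sorted(words)))
```

The loop stops early once every candidate has been matched, and the file is only opened for writing after the read has finished.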

Depending on the number of words you're checking each run, you might do better just loading the entire dictionary file into memory as a hash and checking the new words that way. Perhaps you can have the script choose between the two methods depending on how many words there are to check?
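Something like the following shows what choosing between the two methods could look like. Again this is only a Python sketch assuming one word per line. The cutoff value and all of the function names are hypothetical, and the right cutoff would have to be found by timing real runs.

```python
def new_words_via_hash(dict_path, candidates):
    """Load the whole dictionary into a set (a hash), then test each word."""
    with open(dict_path, encoding="utf-8") as fh:
        known = {line.rstrip("\n") for line in fh}
    return [word for word in candidates if word not in known]


def new_words_via_scan(dict_path, candidates):
    """Stream the dictionary once, keeping only the candidates in memory."""
    pending = set(candidates)
    with open(dict_path, encoding="utf-8") as fh:
        for line in fh:
            pending.discard(line.rstrip("\n"))
            if not pending:
                break  # everything matched, no need to read further
    return [word for word in candidates if word in pending]


def new_words_auto(dict_path, candidates, cutoff=1000):
    """Pick a method by candidate count; the cutoff is an arbitrary guess."""
    if len(candidates) >= cutoff:
        return new_words_via_hash(dict_path, candidates)
    return new_words_via_scan(dict_path, candidates)
```

Both functions return the same answer; they differ only in how much of the dictionary is held in memory at once.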

Or you could even keep the dictionary file in alphabetical order, which will significantly cut down on the number of comparisons you have to do if you don't use a hash. New words would go into a second dictionary file, which would be unsorted and checked only if the first file didn't match everything. You could then run a process every so often to merge the new words into the main dictionary file in alphabetical order (this can't be done every run, since it would require rewriting most of the file).
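A sketch of the sorted-file variant, in Python for illustration. It assumes one word per line and that the main file is sorted in plain code-point order, which is the order Python's string comparison uses. The function names and the ".tmp" suffix are invented for this example.

```python
import heapq
import os


def missing_from_sorted(main_path, candidates):
    """Walk the sorted file and the sorted candidates side by side."""
    wanted = sorted(set(candidates))
    missing = []
    i = 0
    with open(main_path, encoding="utf-8") as fh:
        for line in fh:
            word = line.rstrip("\n")
            # Candidates that sort before this line cannot appear later.
            while i < len(wanted) and wanted[i] < word:
                missing.append(wanted[i])
                i += 1
            if i < len(wanted) and wanted[i] == word:
                i += 1
            if i == len(wanted):
                break  # every candidate is accounted for
    missing.extend(wanted[i:])  # anything sorting after the last line
    return missing


def merge_into_main(main_path, pending_path):
    """Occasional job: fold the unsorted side file into the sorted main file."""
    with open(pending_path, encoding="utf-8") as fh:
        pending = sorted({line.rstrip("\n") for line in fh if line.strip()})
    tmp_path = main_path + ".tmp"
    with open(main_path, encoding="utf-8") as src, \
            open(tmp_path, "w", encoding="utf-8") as dst:
        main_words = (line.rstrip("\n") for line in src if line.strip())
        previous = None
        for word in heapq.merge(main_words, pending):
            if word != previous:  # skip duplicates
                dst.write(word + "\n")
                previous = word
    os.replace(tmp_path, main_path)  # swap in the rewritten file
    open(pending_path, "w", encoding="utf-8").close()  # empty the side file
```

The merge rewrites the whole main file, which is why it only makes sense as an occasional job. Words reported by missing_from_sorted would still need to be checked against the unsorted side file before being treated as new.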

Re^2: reading/writing to a file
by nnp (Initiate) on Jun 18, 2005 at 18:51 UTC
    I thought the only way to read in from a file was one line at a time or all at once? And the files are too big to read in all at once.