in reply to Re: Frequency of words in text file and hashes
in thread Frequency of words in text file and hashes

Hello Ted,
Thanks for the code and your ideas. I am actually working with text in a font for another language
(not English), so the tokenisation and translation code does not work for my data, although of course
it works well with your English test data. Anyway, I got the idea.

In this font, a single letter (vowel or consonant) may be mapped to two or more ASCII characters, e.g.
letter 1 in my font = ASCII chars sd
letter 2 in my font = ASCII chars !#
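
To make the mapping concrete, here is a rough sketch of what I mean. The labels letter1 and letter2, the hash %letter_for and the sub letters_in are only names made up for illustration, and the two entries stand in for the full table. Anything in a word that is not in the table is silently skipped here.

```perl
use strict;
use warnings;

# Hypothetical table: ASCII sequence => label for the letter in my font.
my %letter_for = (
    'sd' => 'letter1',
    '!#' => 'letter2',
);

# Build one alternation, longest sequences first, so that a longer code
# is never split into shorter ones. quotemeta escapes characters like '#'.
my $alt = join '|',
    map  { quotemeta }
    sort { length($b) <=> length($a) || $a cmp $b }
    keys %letter_for;
my $letter_re = qr/$alt/;

# Return the list of letter labels found in a word, in order.
sub letters_in {
    my ($word) = @_;
    return map { $letter_for{$_} } $word =~ /($letter_re)/g;
}

my @letters = letters_in('sd!#sd');
print "@letters\n";    # prints "letter1 letter2 letter1"
```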
Of course we can get the frequency count of words from %count, and also find the number of unique words from it, but we are still building %count from an array into which the words have been pushed.
Right now this works, but if the array were to hold a huge number of words, say 1 million, would this not be a problem? Is there a way around this?
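
Perhaps something along these lines would avoid the array altogether, by counting straight from the filehandle one line at a time? This is only a sketch, assuming the words are separated by whitespace. The sub name count_words and the sample text are made up for the demo, and a real run would open the text file instead of the in-memory string.

```perl
use strict;
use warnings;

# Count words from any filehandle, one line at a time.
# Only the current line and the %count hash are held in memory,
# never an array of all the words.
sub count_words {
    my ($fh) = @_;
    my %count;
    while ( my $line = <$fh> ) {
        # Assumption: words are whitespace-separated.
        $count{$_}++ for split ' ', $line;
    }
    return \%count;
}

# Demo with an in-memory filehandle standing in for the real file.
my $text = "sd!# sd\nsd!# !#sd\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
my $count = count_words($fh);
close $fh;

printf "%s => %d\n", $_, $count->{$_} for sort keys %$count;
print "unique words: ", scalar( keys %$count ), "\n";
```

The hash would then grow only with the number of unique words, not with the total number of words read.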
Thanks,
perl_seeker:)