Since you are writing your output line by line, you don't need to store a matrix. You only need a hash that keeps the number of times each key word is found in the current paragraph. Therefore:
- open the output file
- write the key line as an ordered set (array) with tab (\t) characters in appropriate places
- make a hash of all 100 keys with zero as the value
- then, for each paragraph:
  - read from the input file into a string until you find '-100'
  - see if each character group (up to a space, newline, or punctuation) is a key
  - if so, increment the hash value of that key; if not, ignore it
  - write out the values to the output file as an ordered set
  - reset the hash values to zero
- close the files
You'll want to deal with capitalization somehow, as well. :D
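
Here is a rough sketch of those steps in Python, in case it helps to see them together. It is only one way to do it. The function name count_keys, the keys list, and the end_marker parameter are made up for illustration, and I'm assuming every paragraph ends with '-100', that keys are plain words, and that lowercasing everything is an acceptable way to handle capitalization.

```python
import io
import re
import sys

# Assumption: a "word" is a run of letters or apostrophes; everything else
# (spaces, newlines, punctuation, digits) separates words.
WORD_RE = re.compile(r"[A-Za-z']+")


def count_keys(lines, out, keys, end_marker="-100"):
    """Write a header line of keys, then one line of counts per paragraph."""
    keys = [k.lower() for k in keys]          # compare in lowercase
    out.write("\t".join(keys) + "\n")         # the key line, tab-separated
    counts = dict.fromkeys(keys, 0)           # hash of all keys, zeroed

    def tally(text):
        # Increment the count of every word that is a key; ignore the rest.
        for word in WORD_RE.findall(text):
            word = word.lower()
            if word in counts:
                counts[word] += 1

    def flush():
        # Write the counts in key order, then reset them for the next paragraph.
        out.write("\t".join(str(counts[k]) for k in keys) + "\n")
        for k in keys:
            counts[k] = 0

    for line in lines:
        # A line may contain the end marker; split around each occurrence.
        while end_marker in line:
            before, _, line = line.partition(end_marker)
            tally(before)
            flush()
        tally(line)
    # Assumption: text after the last marker is not a complete paragraph,
    # so it is not written out.


if __name__ == "__main__":
    sample = io.StringIO("The cat sat. The dog ran -100\nA Cat and a cat.\n-100\n")
    count_keys(sample, sys.stdout, ["the", "cat", "dog"])
```

With real files you would pass open file objects in place of the StringIO and sys.stdout, for example inside a with open(...) block, which also takes care of closing them. Only one paragraph's counts are held in memory at a time, so no matrix is needed.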
Don Wilde
"There's more than one level to any answer."