Basewords are the words in file 1 (if I am comparing five files), and words are the words in the other four files.
Stopwords are the words listed in a stop list.
Indeed, I think I did it wrong, and that's why I can't finish it. Would you be able to solve it? I need to find the similarity of one file compared with the other four files (if there are five files in total). The similarity formula is the formula used to find the similarity; it's a fixed formula.
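
To make the setup concrete, here is a rough sketch of the comparison described above. The fixed similarity formula is not given in this post, so the sketch uses Jaccard similarity (shared words divided by all distinct words) purely as a stand-in. The function names, the sample texts, and the tiny stop list are all made up for illustration, and the file contents are shown as plain strings rather than read from disk.

```python
import re

def tokenize(text, stopwords):
    """Lowercase the text, split it into words, and drop stopwords."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {w for w in words if w not in stopwords}

def jaccard(basewords, words):
    """Stand-in similarity (an assumption, not the fixed formula):
    number of shared words divided by number of distinct words overall."""
    union = basewords | words
    if not union:
        return 0.0
    return len(basewords & words) / len(union)

def compare_to_base(base_text, other_texts, stopwords):
    """Return one similarity score per other file, measured against file 1."""
    basewords = tokenize(base_text, stopwords)
    return [jaccard(basewords, tokenize(t, stopwords)) for t in other_texts]

# Hypothetical data: file 1 plus the other four files, and a small stop list.
stopwords = {"the", "a", "on"}
base = "The cat sat on the mat"
others = ["A cat sat on a mat", "The dog sat", "cat mat", "on the"]

for number, score in enumerate(compare_to_base(base, others, stopwords), start=2):
    print(f"file 1 vs file {number}: {score:.2f}")
```

If the fixed formula is something else, only the body of `jaccard` would need to change; the tokenizing and stopword filtering stay the same.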
Thanks