I have a question:
How can we measure the overall similarity of two text files based on their word content and report it as a percentage? For example:
If the files are exactly the same, the similarity should be 100%.
If they are completely different (i.e. no word in one file can be found in the other), the similarity should be 0%.
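
One measure that seems to satisfy both boundary cases is the Jaccard similarity of the two files' word sets: the number of distinct words shared by both files divided by the number of distinct words in either file. The sketch below is only an illustration under some assumptions of mine: words are runs of letters, digits, or underscores, comparison is case-insensitive, two empty files count as identical, and the function names are made up for this example. It ignores word order and word frequency, so two files with the same vocabulary but different text would still score 100%.

```python
import re


def words(text):
    """Return the set of distinct lowercase words in text (assumed tokenization)."""
    return set(re.findall(r"\w+", text.lower()))


def jaccard_similarity(text_a, text_b):
    """Return shared distinct words / all distinct words, as a percentage."""
    a, b = words(text_a), words(text_b)
    if not a and not b:
        # Assumption: two inputs with no words at all are treated as identical.
        return 100.0
    return 100.0 * len(a & b) / len(a | b)


def file_similarity(path_a, path_b):
    """Hypothetical helper: read two UTF-8 text files and compare their words."""
    with open(path_a, encoding="utf-8") as fa, open(path_b, encoding="utf-8") as fb:
        return jaccard_similarity(fa.read(), fb.read())
```

For instance, `jaccard_similarity("the cat sat", "the cat ran")` shares 2 of 4 distinct words and gives 50.0. If word counts should matter, a frequency-based measure such as cosine similarity over word-count vectors might be a better fit.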