in reply to Re^3: diff of two strings
in thread diff of two strings
Excellent! I will try it later today on a big corpus of documents to see if I can spot any exceptions, and I will come back with feedback.
multiple occurrences:
If the sentence contains two or more consecutive identical words, it doesn't matter which one is marked "new" and which one is marked "moved".
real world problem:
This program is meant to help detect changes in the semantics of a corpus of similar documents.
Since changed word order and new words are the first candidates for a semantic modification, I need a program that detects them and shows them in parallel.
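To make the goal concrete, here is a minimal sketch of that kind of labelling. It is not the code from this thread: it is written in Python with the standard `difflib` module, and the function name `classify_words` and the labels "same", "moved" and "new" are my own assumptions for illustration. A word that only shows up in the new sentence is "new"; a word that was dropped from one position and reappears at another is "moved". Repeated words are handled by counting, so which copy gets which label is arbitrary, as noted above.

```python
from collections import Counter
from difflib import SequenceMatcher


def classify_words(old_sentence, new_sentence):
    """Label each word of new_sentence as 'same', 'moved' or 'new'.

    Hypothetical helper for illustration; splits on whitespace only.
    """
    old_words = old_sentence.split()
    new_words = new_sentence.split()
    matcher = SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    opcodes = matcher.get_opcodes()

    # Count the words that vanished from their old position.
    removed = Counter()
    for tag, i1, i2, j1, j2 in opcodes:
        if tag in ("delete", "replace"):
            removed.update(old_words[i1:i2])

    labels = []
    for tag, i1, i2, j1, j2 in opcodes:
        for word in new_words[j1:j2]:
            if tag == "equal":
                labels.append((word, "same"))
            elif removed[word] > 0:
                # Dropped elsewhere in the old sentence: treat as moved.
                removed[word] -= 1
                labels.append((word, "moved"))
            else:
                labels.append((word, "new"))
    return labels


if __name__ == "__main__":
    for word, label in classify_words("the quick brown fox",
                                      "the brown quick red fox"):
        print(f"{label:6} {word}")
```

In this example "red" should come out as new, and one of "quick" or "brown" as moved; which of the two depends on how the matcher aligns the sentences. Punctuation and case are left untouched here, so a real corpus would need a proper tokenizer first.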