in reply to diff of two strings

chaos_cat

I tried your program against a corpus of documents and it works well for many sentences. However, I found a logic problem.

Given a sentence of numbers, I get the following result (which is syntactically correct, but logically incorrect):

01 <02> [03] [04] [03] [05] [06] [07] [08] [09] [10] [11] <03> 12 13 14 15 12 16
01 ......... [04] [03] [05] [06] [07] [08] [09] [10] [11] [03] ........ 12 13 14 15 12 16

instead of:

01 <02> <03> 04 03 05 06 07 08 09 10 11 03 12 13 14 15 12 16
01 .................. 04 03 05 06 07 08 09 10 11 03 12 13 14 15 12 16
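For comparison, here is a minimal sketch of the expected alignment, written in Python with the standard library's `difflib.SequenceMatcher` rather than the program discussed in this thread. `SequenceMatcher` anchors on the longest contiguous matching block, which is only one possible strategy. The helper name `diff_tokens` is my own, for illustration only.

```python
from difflib import SequenceMatcher


def diff_tokens(old, new):
    """Return (tag, old_slice, new_slice) triples describing how old becomes new."""
    # autojunk=False disables difflib's popularity heuristic (irrelevant for
    # short inputs, but it keeps the behaviour explicit).
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    return [
        (tag, old[i1:i2], new[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
    ]


# The two token sequences from the example above.
old = "01 02 03 04 03 05 06 07 08 09 10 11 03 12 13 14 15 12 16".split()
new = "01 04 03 05 06 07 08 09 10 11 03 12 13 14 15 12 16".split()

for tag, removed, added in diff_tokens(old, new):
    if tag != "equal":
        print(tag, removed, added)
# prints: delete ['02', '03'] []
```

Because the 16-token run starting at 04 is matched as a single block, the only change left is deleting 02 and 03. That is the result I expected.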

Is there a way to work around that problem? Thanks.