in reply to Re^2: Sorting Vietnamese text
in thread Sorting Vietnamese text
It's still missing a correct 'secondary sort' (for the edge case when the diacritic-stripped words are identical);
(laughs) That's hardly an "edge case" in Vietnamese: there are thousands of minimal pairs in which the two words differ only in their diacritical marks. Vietnamese typed in plain (7-bit) ASCII can usually be read without too much ambiguity, because you can figure out which word is meant from the context, but this obviously wouldn't work for a dictionary, where each headword stands on its own.
The other issue is that the words in the dictionary need to be sorted in the "correct" order for me to detect duplicates, etc.
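To make concrete what I mean by a secondary sort, here is a rough sketch (in Python rather than Perl, only because it is quick to test; it is not the code suggested earlier in the thread). The helper names `strip_marks`, `sort_key` and `find_duplicates` are made up for illustration. It assumes the primary key is the diacritic-stripped word, with đ folded to d, and it breaks ties by plain code point order of the NFC form. That tie-break is only a deterministic placeholder, not the tone order a Vietnamese dictionary would actually use.

```python
import unicodedata


def strip_marks(word):
    """Remove combining marks; also fold đ/Đ to d/D (they do not decompose)."""
    decomposed = unicodedata.normalize("NFD", word)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return base.replace("đ", "d").replace("Đ", "D")


def sort_key(word):
    """Two-level key: stripped form first, then the full word as a tie-break.

    ASSUMPTION: the tie-break is code point order of the NFC form, which is
    stable and repeatable but is NOT the conventional dictionary tone order.
    """
    primary = strip_marks(word).casefold()
    secondary = unicodedata.normalize("NFC", word)
    return (primary, secondary)


def find_duplicates(sorted_words):
    """Return words that appear twice in a row in an already sorted list."""
    duplicates = []
    for previous, current in zip(sorted_words, sorted_words[1:]):
        same = (unicodedata.normalize("NFC", previous)
                == unicodedata.normalize("NFC", current))
        if same and current not in duplicates:
            duplicates.append(current)
    return duplicates


# Six words that all strip to "ma", one unrelated word, and one repeat.
words = ["mã", "ma", "mạ", "má", "mà", "mả", "lá", "ma"]
ordered = sorted(words, key=sort_key)
duplicates = find_duplicates(ordered)

print(ordered)
print(duplicates)
```

Because identical entries end up next to each other once the list is sorted, duplicate detection only has to compare neighbours. Swapping the placeholder tie-break for a proper tone ranking would only change `sort_key`.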
I'll try out your suggestion later today - thanks again!
Re^4: Sorting Vietnamese text
by Anonymous Monk on Dec 23, 2013 at 18:47 UTC

Re^4: Sorting Vietnamese text
by Anonymous Monk on Dec 23, 2013 at 19:04 UTC