Thai and Lao text ... these languages, sentences are generally delimited by whitespace, and individual words are not delimited at all in the text, but instead are delimited by syntactic rules.
So, it seems fair to say that the first requirement for processing Unicode 'text' is to determine the language.
So then the question becomes: given a file of Unicode text, can the language be determined?
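
A partial answer can be sketched in Perl: the script of each character is available through Unicode properties, even though the language is not. The snippet below is only an illustration, not a language detector. The helper name `dominant_script` and the list of scripts are my own assumptions, and it expects an already-decoded character string. Knowing the script narrows things down for Thai or Lao, but it says nothing useful for Latin, which is shared by many languages.

```perl
use strict;
use warnings;

# Hypothetical helper (not from any module): one precompiled pattern per
# Unicode script we care about. The list is illustrative, not exhaustive.
my %script_re = (
    Latin    => qr/\p{Script=Latin}/,
    Cyrillic => qr/\p{Script=Cyrillic}/,
    Thai     => qr/\p{Script=Thai}/,
    Lao      => qr/\p{Script=Lao}/,
    Han      => qr/\p{Script=Han}/,
);

# Count the characters belonging to each script and return the name of the
# most frequent one, or undef if none of the listed scripts occurs at all.
sub dominant_script {
    my ($text) = @_;    # must be a decoded character string, not raw bytes
    my %count;
    for my $script (keys %script_re) {
        my $n = () = $text =~ /$script_re{$script}/g;   # number of matches
        $count{$script} = $n if $n;
    }
    # Highest count first; ties broken alphabetically for a stable result.
    my ($best) = sort { $count{$b} <=> $count{$a} || $a cmp $b } keys %count;
    return $best;
}

# Usage: three Thai characters followed by two Latin letters.
my $guess = dominant_script("\x{0E2A}\x{0E27}\x{0E31} ok");
print defined $guess ? "$guess\n" : "unknown\n";
```

This only moves the problem: once the script is known, something else (a dictionary, statistics, or metadata supplied with the file) still has to decide the language, and with it the word-breaking rules.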
In reply to Re^2: Perl & Unicode: state of the art?
by BrowserUk
in thread Perl & Unicode: state of the art?
by BrowserUk