PerlMonks
Re^3: Perl & Unicode: state of the art?
by LanX (Saint) on Oct 08, 2013 at 00:45 UTC
> can the language be determined?

You know the answer: only with statistical certainty, and that depends on the length of the text and on how far apart the languages are. Compare:

Hand and finger (en) <=> Hand und Finger (de)

Whether the same script implies the same delimiters can only be answered by someone who knows all 6000 languages of the world. But Arabic words are probably already a problem, maybe less so if transcribed. Chinese even more so.

See also Word_divider and Word#Word_boundaries.
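To make the "statistical certainty" point concrete, here is a minimal sketch of a guesser that only counts known function words. It assumes tiny hand-picked word lists; the names `%function_words`, `score_languages` and `guess_language` are made up for illustration, and a real detector would more likely rely on character n-gram frequencies over much larger models.

```perl
use strict;
use warnings;

# Hypothetical, deliberately tiny function-word lists (an assumption
# for illustration only, not a usable language model).
my %function_words = (
    en => { map { $_ => 1 } qw(the and of to is that it) },
    de => { map { $_ => 1 } qw(der die das und zu ist nicht) },
);

# Count, per language, how many tokens of $text are known function words.
sub score_languages {
    my ($text) = @_;
    my @tokens = map { lc $_ } $text =~ /(\w+)/g;
    my %score  = map { $_ => 0 } keys %function_words;
    for my $token (@tokens) {
        for my $lang (keys %function_words) {
            $score{$lang}++ if $function_words{$lang}{$token};
        }
    }
    return \%score;
}

# Return the best-scoring language, or undef when there is no evidence
# at all or the top two languages are tied.
sub guess_language {
    my ($text) = @_;
    my $score  = score_languages($text);
    my @ranked = sort { $score->{$b} <=> $score->{$a} || $a cmp $b }
                 keys %$score;
    return if $score->{ $ranked[0] } == 0;
    return if @ranked > 1
           && $score->{ $ranked[0] } == $score->{ $ranked[1] };
    return $ranked[0];
}

print guess_language("Hand and finger"), "\n";    # en
print guess_language("Hand und Finger"), "\n";    # de
print defined guess_language("Hand") ? "guess" : "undecided", "\n";    # undecided
```

Each of the two three-word samples is decided by a single word, and "Hand" alone gives no evidence either way, which is the dependence on text length and language distance in miniature.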
Cheers Rolf
(addicted to the Perl Programming Language)