in reply to Re: Perl Modules for handling Non English text
in thread Perl Modules for handling Non English text
That's not enough for all languages and hence "wide characters", or 16 bit ones.
Perl's wide chars are 32-bit or 64-bit, depending on the build, not 16-bit.
fmdev10$ perl -le'print ord "\x{FFFFFFFF}"'
4294967295
persephone$ perl -le'print ord "\x{FFFFFFFFFFFFFFFF}"'
18446744073709551615
Unicode itself currently requires only 21 bits (code points run up to U+10FFFF, across 17 planes).
is each byte a character or is two bytes a character?
Or something else entirely, as in the following popular encodings: UTF-8 (currently 1-4 bytes per char; the original design allowed up to 6) and UTF-16le/UTF-16be (2 or 4 bytes per char).
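To make the variable widths concrete, here is a minimal sketch using the core Encode module. The helper name `encoded_lengths` and the sample characters are my own choices for illustration, not anything from the posts above.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: returns the number of bytes a single character
# occupies in UTF-8 and in UTF-16LE.
sub encoded_lengths {
    my ($ch) = @_;
    my $utf8    = encode('UTF-8',    $ch);
    my $utf16le = encode('UTF-16LE', $ch);
    return (length($utf8), length($utf16le));
}

# Sample characters: ASCII, Latin-1 range, rest of the BMP, and a
# character outside the BMP (which needs a surrogate pair in UTF-16).
for my $ch ("A", "\x{E9}", "\x{20AC}", "\x{1D11E}") {
    my ($u8, $u16) = encoded_lengths($ch);
    printf "U+%04X: UTF-8 = %d byte(s), UTF-16LE = %d byte(s)\n",
        ord($ch), $u8, $u16;
}
```

So neither "one byte per character" nor "two bytes per character" holds in general; the byte count depends on both the encoding and the character.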