in reply to Does Perl support unicode-16?

Ah, the confusions surrounding Unicode. For something given a name that means 'one code' there sure are a lot of different ways to specify it...

UTF-16 is not a 'larger character set' than UTF-8.

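UTF-16 is an 'encoding', a way of representing characters as a sequence of bytes. It encodes the most commonly used characters (everything in the Basic Multilingual Plane) as a single 16-bit unit; the remaining characters take two 16-bit units, known as a surrogate pair. Windows NT Unicode strings are UTF-16 encoded.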
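To make the "most, but not all" point concrete, here is a small sketch using the Encode module. The helper name utf16_length is made up for this illustration, and 'UTF-16BE' is used so that no BOM is counted in the length:

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: number of bytes a character string
# occupies when encoded as big-endian UTF-16 without a BOM.
sub utf16_length {
    my ($chars) = @_;
    return length encode('UTF-16BE', $chars);
}

my $bmp_char    = "\x{263A}";    # WHITE SMILING FACE, inside the BMP
my $astral_char = "\x{1D11E}";   # MUSICAL SYMBOL G CLEF, outside the BMP

printf "U+263A:  %d bytes\n", utf16_length($bmp_char);     # 2
printf "U+1D11E: %d bytes\n", utf16_length($astral_char);  # 4 (surrogate pair)
```

So a UTF-16 string is not simply "two bytes per character" once you leave the Basic Multilingual Plane.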

UTF-8 is another encoding, and the one Perl uses internally. It encodes each of the original 7-bit ASCII characters as a single byte, with exactly the byte value it has in ASCII; every other character takes between two and four bytes.
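A quick way to see this for yourself is to dump the bytes Encode produces. This is only a sketch; utf8_bytes is a made-up helper name, not part of Encode:

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: hex dump of a character string's UTF-8 encoding.
sub utf8_bytes {
    my ($chars) = @_;
    my $octets = encode('UTF-8', $chars);
    return join ' ', map { sprintf '%02X', $_ } unpack 'C*', $octets;
}

print utf8_bytes("A"), "\n";          # 41        (same byte as ASCII)
print utf8_bytes("\x{E9}"), "\n";     # C3 A9     (e with acute accent)
print utf8_bytes("\x{20AC}"), "\n";   # E2 82 AC  (euro sign)
```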

If you have an application that's expecting UTF-16, you'll want to use the Encode module (which has been in the core since Perl 5.8) to turn your character string into a byte string that Perl will emit as UTF-16:

    use Encode;

    my ($unicode_string, $utf16_string);

    $unicode_string = get_a_unicode_string();
    # ^^ this string is a character string, internally stored
    # as UTF-8

    $utf16_string = encode('UTF-16', $unicode_string);
    # ^^ this string is an 'octet' (byte) string, internally
    # stored as bytes. Each character of $unicode_string is stored
    # in two bytes of $utf16_string (four for characters outside
    # the Basic Multilingual Plane).
    # (Also note the presence of a UTF-16 BOM at the start)

    function_expecting_utf16($utf16_string);
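About that BOM: as far as I know, Encode's plain 'UTF-16' writes big-endian data with a BOM in front, while 'UTF-16LE' and 'UTF-16BE' fix the byte order and write no BOM. Windows APIs generally want little-endian, so check which one your consumer expects. A small sketch to compare the three (hex is a made-up helper):

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: hex dump of a byte string.
sub hex {
    my ($octets) = @_;
    return join ' ', map { sprintf '%02X', $_ } unpack 'C*', $octets;
}

my $with_bom      = encode('UTF-16',   "A");   # BOM, then big-endian data
my $little_endian = encode('UTF-16LE', "A");   # no BOM
my $big_endian    = encode('UTF-16BE', "A");   # no BOM

print hex($with_bom), "\n";        # FE FF 00 41
print hex($little_endian), "\n";   # 41 00
print hex($big_endian), "\n";      # 00 41
```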

Update:

(Thanks, ytsh)
--Stevie-O
$"=$,,$_=q>|\p4<6 8p<M/_|<('=> .q>.<4-KI<l|2$<6%s!<qn#F<>;$, .=pack'N*',"@{[unpack'C*',$_] }"for split/</;$_=$,,y[A-Z a-z] {}cd;print lc