
I'd be really surprised if there was any real need or impact of final ":utf8" -- I think you can dispense with that.

The reason you need the final :utf8 is that the crlf layer is kinda turning off the UTF8-ness (or whatever you want to call it...). In other words, if you have a string containing non-ASCII characters (which was the reason for inventing Unicode in the first place, wasn't it :), you'd get nonsense, because the utf8 flag will either be ignored (on output) or not be set (on input). Of course, if you're only outputting an ASCII-only string like "hello", you won't see a difference...

For example, when replacing the "e" in "hello" with an "ä" (a-umlaut, U+00E4), you'd get correct output with

    open my $fh, ">:raw:encoding(UTF-16LE):crlf:utf8", "ok.utf16" or die;
    print $fh "h\x{00e4}llo\n";

    $ od -tx1 -An ok.utf16
     68 00 e4 00 6c 00 6c 00 6f 00 0d 00 0a 00

but not with

    open my $fh, ">:raw:encoding(UTF-16LE):crlf", "err.utf16" or die;
    print $fh "h\x{00e4}llo\n";

    $ od -tx1 -An err.utf16
     68 00 00 00 6c 00 6c 00 6f 00 0d 00 0a 00
           ^^ wrong

accompanied by this warning when running the code:

Malformed UTF-8 character (unexpected non-continuation byte 0x6c, immediately after start byte 0xe4) in null operation at ...
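
If you want to double-check which bytes the file should contain without relying on the layer stack at all, here is a minimal sketch that encodes the same string by hand with the Encode module. This is a different technique from the open() layers discussed above, not a replacement for them; the variable names are just for illustration, and the newline is expanded to CRLF manually because no :crlf layer is involved.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Same string as in the examples above.
my $text = "h\x{00e4}llo\n";

# Expand "\n" to "\r\n" by hand, since no :crlf layer is in play here.
(my $crlf = $text) =~ s/\n/\r\n/g;

# Encode the characters to UTF-16LE bytes explicitly.
my $bytes = encode("UTF-16LE", $crlf);

# Hex dump, one two-digit group per byte, like `od -tx1`.
my $hex = join " ", unpack("(H2)*", $bytes);
print "$hex\n";
```

The third byte comes out as e4, matching the ok.utf16 dump and not the err.utf16 one.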