in reply to Unicode problem
Note that ":raw" is sort of a synonym for "unix" in this context (that is, the last two examples below behave the same when using "unix" instead of "raw").

$ perl -e 'open($fh,">:crlf:encoding(UTF-16BE)", "test.utf16"); print $fh "hello\n"'
$ xxd test.utf16
0000000: 0068 0065 006c 006c 006f 000d 0a  .h.e.l.l.o...
# that was bad -- odd number of bytes

$ perl -e 'open($fh,">:encoding(UTF-16BE):crlf", "test.utf16"); print $fh "hello\n"'
$ xxd test.utf16
0000000: 0068 0065 006c 006c 006f 000d 000a  .h.e.l.l.o....
# that was good.

$ perl -e 'open($fh,">:raw:encoding(UTF-16BE)", "test.utf16"); print $fh "hello\n"'
$ xxd test.utf16
0000000: 0068 0065 006c 006c 006f 000a  .h.e.l.l.o..
# also good (no CR, but who needs that anyway? ;)

$ perl -e 'open($fh,">:encoding(UTF-16BE):raw", "test.utf16"); print $fh "hello\n"'
$ xxd test.utf16
0000000: 6865 6c6c 6f0a  hello.
# not what you want
So, if you want a "standard" CRLF discipline for output that interacts correctly with UTF-16, there seems to be only one way to do it: put ":crlf" after the encoding layer. OTOH, if you want unix-like LF discipline with UTF-16, there are a couple of ways to get that (you can say "unix" or "raw"), but you still have to get the layers in the right order, with ":raw" or ":unix" before the encoding layer.
update: As almut wisely explains above, my examples are deficient -- each of the "working" cases should have the additional ":utf8" layer at the end, so that actual "wide characters" in the strings being printed will be interpreted and encoded correctly on output. Apologies for the confusion.
Re^2: Unicode problem
by adrodin (Initiate) on Aug 21, 2007 at 06:33 UTC