in reply to Unicode problem

I suspect you are using a Windows system, and Errto is on the right track: you need to take proper control of the "native CRLF" behavior for that OS. Note that the ordering of the PerlIO layers can be significant. I don't have a Windows system to test on myself, but my BSD-based Mac OS X shows the following behaviors with the various permutations. YMMV, but I think you'll see something like this:
$ perl -e 'open($fh,">:crlf:encoding(UTF-16BE)", "test.utf16"); print $fh "hello\n"'
$ xxd test.utf16
0000000: 0068 0065 006c 006c 006f 000d 0a         .h.e.l.l.o...
# that was bad -- odd number of bytes

$ perl -e 'open($fh,">:encoding(UTF-16BE):crlf", "test.utf16"); print $fh "hello\n"'
$ xxd test.utf16
0000000: 0068 0065 006c 006c 006f 000d 000a       .h.e.l.l.o....
# that was good.

$ perl -e 'open($fh,">:raw:encoding(UTF-16BE)", "test.utf16"); print $fh "hello\n"'
$ xxd test.utf16
0000000: 0068 0065 006c 006c 006f 000a            .h.e.l.l.o..
# also good (no CR, but who needs that anyway? ;)

$ perl -e 'open($fh,">:encoding(UTF-16BE):raw", "test.utf16"); print $fh "hello\n"'
$ xxd test.utf16
0000000: 6865 6c6c 6f0a                           hello.
# not what you want
Note that ":raw" is sort of a synonym for "unix" in this context (that is, the last two examples behave the same when using "unix" instead of "raw").
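If you'd rather compare the permutations in one run instead of retyping one-liners, something like the following sketch should do it. The helper name hex_for_layers is made up for illustration, it writes to a temp file via File::Temp, and like the one-liners above it has only been reasoned about for a unix-like system, so treat any Windows behavior as unknown.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write "hello\n" through the given PerlIO layer
# string, then read the file back as raw bytes and return them as hex.
sub hex_for_layers {
    my ($layers) = @_;
    my ($tmp, $path) = tempfile(UNLINK => 1);
    close $tmp;

    open(my $out, ">$layers", $path) or die "open $path with $layers: $!";
    print {$out} "hello\n";
    close($out) or die "close $path: $!";

    open(my $in, '<:raw', $path) or die "reopen $path: $!";
    local $/;                      # slurp the whole file
    my $bytes = <$in>;
    close $in;
    return unpack('H*', $bytes);
}

# The same four layer orderings as in the one-liners above.
for my $layers (
    ':crlf:encoding(UTF-16BE)',
    ':encoding(UTF-16BE):crlf',
    ':raw:encoding(UTF-16BE)',
    ':encoding(UTF-16BE):raw',
) {
    printf "%-28s %s\n", $layers, hex_for_layers($layers);
}
```

An even number of hex digit pairs per character (four hex digits each) is the thing to look for; a stray "0d" or "0a" on its own means the CRLF translation happened below the encoding layer.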

So, if you want a "standard" CRLF discipline for output that interacts correctly with UTF-16, it seems there's only one way to do it: put ":crlf" after the ":encoding(UTF-16BE)" layer. OTOH, if you want unix-like LF discipline with UTF-16, there are a couple of ways to get that (you can say "unix" or "raw"), but you still have to get the layers in the right order.

update: As almut wisely explains above, my examples are deficient -- each of the "working" cases should have the additional ":utf8" layer at the end, so that actual "wide characters" in the strings being printed will be interpreted and encoded correctly on output. Apologies for the confusion.
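For what it's worth, here is a rough sketch of the two "working" layer strings with the trailing ":utf8" added, wrapped in a small helper so the line-ending choice is explicit. The names open_utf16be_out and utf16be_hex are invented for this example, and the whole thing is untested on Windows, same as the one-liners; the default layers there may well call for something different.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: open $path for UTF-16BE output with either
# CRLF or LF line endings, using the layer orders from the post plus
# the trailing :utf8 layer mentioned in the update.
sub open_utf16be_out {
    my ($path, $eol) = @_;
    my $layers = $eol eq 'crlf'
        ? ':encoding(UTF-16BE):crlf:utf8'
        : ':raw:encoding(UTF-16BE):utf8';
    open(my $fh, ">$layers", $path) or die "open $path: $!";
    return $fh;
}

# Hypothetical helper: write $text with the chosen line-ending
# discipline and return the resulting file contents as a hex string.
sub utf16be_hex {
    my ($eol, $text) = @_;
    my ($tmp, $path) = tempfile(UNLINK => 1);
    close $tmp;

    my $fh = open_utf16be_out($path, $eol);
    print {$fh} $text;
    close($fh) or die "close $path: $!";

    open(my $in, '<:raw', $path) or die "reopen $path: $!";
    local $/;                      # slurp the whole file
    my $bytes = <$in>;
    close $in;
    return unpack('H*', $bytes);
}

# U+263A is a character outside Latin-1, i.e. a real "wide character".
print utf16be_hex('lf',   "\x{263A}\n"), "\n";
print utf16be_hex('crlf', "hi\n"),       "\n";
```

Dumping the result as hex makes it easy to check that every character, including the line terminator, comes out as a whole number of 16-bit units.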

Re^2: Unicode problem
by adrodin (Initiate) on Aug 21, 2007 at 06:33 UTC