in reply to Writing unicode characters to file using open($fh, ">:utf8, $name) mangles unicode?
Note that my $iso_8859_1 = 'Österreich'; is only guaranteed to be ISO-8859-1 encoded if you know that the source file itself is saved as ISO-8859-1 (instead of UTF-8) and you've not switched on "use utf8" somewhere. Otherwise the literal may hold UTF-8 bytes or already-decoded characters, and that can cause all kinds of interesting issues.
Also note that this is exactly the kind of thing you do NOT want to have to deal with. I'm tempted to say: just make a habit of putting "use utf8" in all your scripts and saving them as UTF-8, or use only 7-bit ASCII in source files.
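To make the effect of "use utf8" on a literal concrete, here is a minimal sketch. It assumes the script file itself is saved as UTF-8; the variable names are only for illustration. Because the pragma is lexically scoped, both cases fit in one file:

```perl
use strict;
use warnings;

# Assumption: this script file is saved as UTF-8.

# Without the pragma, Perl takes the literal as the raw bytes in the file.
my $bytes = do { no utf8; 'Österreich' };

# With the pragma, the same literal is decoded into characters.
my $chars = do { use utf8; 'Österreich' };

printf "no utf8:  length %d\n", length $bytes;   # 11 (the "Ö" is two bytes)
printf "use utf8: length %d\n", length $chars;   # 10
```

If the file were saved as ISO-8859-1 instead, the first case would give 10 single-byte characters, which is the situation the original variable name describes.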
The only sane way to deal with Unicode I/O is to keep everything correctly flagged as being either in the "internal multibyte encoding" or binary/8-bit, use IO layers for input/output, use Encode::decode() to interpret binary strings directly if you have to, and never, ever use Encode::encode() (let the output layer do the encoding for you):
    use Encode;

    # interpret the ISO-8859-1 bytes as characters
    my $string = Encode::decode("iso-8859-1", "\x{d6}sterreich");

    # we want to write a utf-8 file
    open my $fh, ">:utf8", "/some/path" or die $!;
    print $fh $string;
    close $fh or die $!;
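A hedged round-trip sketch of the same idea: decode once on the way in, let the layers handle the encoding on the way out and back in, then check what actually landed on disk. The temporary file from File::Temp stands in for "/some/path", and reading back with ":encoding(UTF-8)" (which also validates the input) is my choice here, not something the code above prescribes.

```perl
use strict;
use warnings;
use Encode ();
use File::Temp qw(tempfile);

# Hypothetical scratch file standing in for "/some/path".
my (undef, $path) = tempfile(UNLINK => 1);

# ISO-8859-1 bytes -> Perl character string.
my $string = Encode::decode("iso-8859-1", "\x{d6}sterreich");

# Output: the :utf8 layer turns characters into UTF-8 bytes.
open my $out, ">:utf8", $path or die $!;
print $out $string;
close $out or die $!;

# Read the file back as raw bytes to see what is on disk.
open my $raw, "<:raw", $path or die $!;
my $bytes = do { local $/; <$raw> };
close $raw or die $!;

# Read it back through a decoding layer to get characters again.
open my $in, "<:encoding(UTF-8)", $path or die $!;
my $roundtrip = do { local $/; <$in> };
close $in or die $!;

printf "%d bytes on disk, %d characters in memory\n",
    length $bytes, length $roundtrip;   # 11 bytes, 10 characters
```

No call to Encode::encode() appears anywhere: the only explicit conversion is the single decode() at the boundary where bytes of a known encoding enter the program.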