Dear monks,

I am trying to write Unicode, specifically UTF-8, to a file. As the data exists in ISO-8859-1 (or another character set), it must be converted first.
I then write a UTF-8 string to a file. After reading the docs I thought I had to open the file for writing using open($fh, '>:utf8', $filename). However, when I do this and look at the file in any Unicode-capable editor, I see garbage. If I write the file normally, using open($fh, '>', $filename), all seems well. As this contradicts perluniintro, which clearly states that one should use the former open() method, or even the use open ':utf8' pragma, when dealing with files, I am sure I must be doing something wrong.

The following code is meant to illustrate my problem. The files '_original' and '_decoded' are the same and I do not find this surprising. The file '_utf8' does not display the characters correctly, unless I change the code to write_to_file('>', '_utf8', $utf8);.
    use Encode;

    # Write $what to $filename, opening the file with the given mode.
    sub write_to_file {
        my ($mode, $filename, $what) = @_;
        # open() reports its failure reason in $!, not $@
        open(my $fh, $mode, $filename)
            or die "Couldn't open $filename for writing: $!";
        print $fh $what;
        close $fh;
    }

    my $iso_8859_1 = 'Österreich';
    my $string     = Encode::decode('iso-8859-1', 'Österreich');
    my $utf8       = Encode::encode_utf8($string);

    write_to_file('>',      '_original', $iso_8859_1);
    write_to_file('>',      '_decoded',  $string);
    write_to_file('>:utf8', '_utf8',     $utf8);

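To make the comparison easier to reproduce, here is a minimal sketch that counts the bytes each combination produces. It makes a few assumptions that are not in the code above: it writes to in-memory filehandles instead of real files, it uses a hypothetical helper named bytes_written, and it spells 'Ö' as the escape "\xD6" so the result does not depend on how the script itself is saved.

```perl
use strict;
use warnings;
use Encode qw(decode encode_utf8);

# Hypothetical helper: print $what to an in-memory file opened with $mode
# and return the bytes that ended up in it.
sub bytes_written {
    my ($mode, $what) = @_;
    my $buffer = '';
    open(my $fh, $mode, \$buffer)
        or die "Couldn't open in-memory file: $!";
    print {$fh} $what;
    close $fh;
    return $buffer;
}

# "\xD6" is the ISO-8859-1 byte for 'Ö' (assumption: input is ISO-8859-1).
my $iso_8859_1 = "\xD6sterreich";
my $string     = decode('iso-8859-1', $iso_8859_1);   # character string
my $utf8       = encode_utf8($string);                # already-encoded bytes

my $raw_bytes     = bytes_written('>',      $utf8);   # bytes pass through as-is
my $layer_chars   = bytes_written('>:utf8', $string); # layer encodes the characters
my $layer_encoded = bytes_written('>:utf8', $utf8);   # layer encodes the bytes again

printf "%-32s %d bytes\n", "'>' with encoded bytes:",      length $raw_bytes;
printf "%-32s %d bytes\n", "'>:utf8' with decoded string:", length $layer_chars;
printf "%-32s %d bytes\n", "'>:utf8' with encoded bytes:",  length $layer_encoded;
```

In this sketch the first two writes produce identical bytes, while the third comes out two bytes longer: the ':utf8' layer treats each of the two bytes of the already-encoded 'Ö' as a character of its own and encodes it again. That would be consistent with the garbage seen in '_utf8', and it suggests that the layer expects the decoded $string rather than the output of encode_utf8.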
I would appreciate any wisdom you could shed on the matter.

-- tel
