vkon has asked for the wisdom of the Perl Monks concerning the following question:

Hi,
I am probably missing something obvious, but reading the File::Slurp manpage has not helped.

The following piece of code dies, and my content does not end up in the file "a.txt" (it has zero size):

    use utf8;
    use File::Slurp;
    use Encode;
    my $ustr = "simple unicode string \x{0434} indeed";
    #Encode::_utf8_off($ustr); # ???? why do I need this?
    File::Slurp::write_file("a.txt", {binmode=>':utf8'}, $ustr);

The error message I get is: Wide character in syswrite at d:/perl5121/site/lib/File/Slurp.pm line 330. Notice the commented-out Encode::_utf8_off($ustr); line.

If I uncomment this line, i.e. if I "reset" $ustr to be *not* utf8, all is good.

So File::Slurp refuses to write a scalar that has the UTF8 flag set, despite the fact that I specified {binmode=>':utf8'}, as the manpage recommends.

What am I doing wrong?

Re: can not use utf8 with File::Slurp
by Corion (Patriarch) on Feb 15, 2011 at 15:57 UTC

    Looking at File::Slurp, it thinks binmode is only for cr/lf translation. It is broken that way, because it never calls binmode but does some fiddling with O_ flags and sysopen instead. I think it completely ignores PerlIO in favour of speed. My recommendation is to replace File::Slurp with:

    sub write_file_utf8 {
        my $name = shift;
        open my $fh, '>:encoding(UTF-8)', $name
            or die "Couldn't create '$name': $!";
        local $/;
        print {$fh} $_ for @_;
    };

    This likely is a bit slower, but likely it works, as opposed to File::Slurp. You might also want to open a bug against File::Slurp.

      I think it completely ignores PerlIO in favour of speed.

      Nit: sysopen doesn't stop PerlIO from being used. Either your Perl uses PerlIO or it doesn't. (Perl is built to use PerlIO by default since 5.8. I wouldn't be surprised if Perl didn't work without PerlIO anymore.) A given Perl only supports one kind of file handle. It doesn't matter whether you use open or sysopen.

      That doesn't mean there is no difference. When using O_TEXT on Windows, Perl might let clib do the LF⇒CRLF translation instead of using the :crlf layer. That could be faster, but I would hope it's not.
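
      To illustrate the point that a handle from sysopen is still an ordinary Perl file handle, here is a minimal sketch. The file name sysopen_demo.txt is made up for the example, and this is not what File::Slurp does internally; it only shows that a layer can be pushed after sysopen and that print then encodes through it.

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_CREAT O_TRUNC);

my $file = 'sysopen_demo.txt';    # hypothetical file name for this sketch
my $ustr = "simple unicode string \x{0434} indeed";

# sysopen gives back an ordinary Perl file handle ...
sysopen(my $fh, $file, O_WRONLY | O_CREAT | O_TRUNC)
    or die "Couldn't create '$file': $!";

# ... so an encoding layer can still be pushed onto it afterwards.
binmode($fh, ':encoding(UTF-8)')
    or die "Couldn't set layer on '$file': $!";

# print goes through the layer stack, so the string is encoded on the way out.
print {$fh} $ustr;
close($fh) or die "Couldn't close '$file': $!";

# The string has 30 characters; U+0434 takes two bytes in UTF-8.
my $size = -s $file;
print "$size\n";    # 31
```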

Re: can not use utf8 with File::Slurp
by mje (Curate) on Feb 15, 2011 at 16:22 UTC
Re: can not use utf8 with File::Slurp
by ikegami (Patriarch) on Feb 15, 2011 at 16:49 UTC
      Never heard of this, thanks for pointing it out.

      Indeed, loading Encode and then calling Encode::_utf8_off just to reset the UTF8 bit is overkill.

        I didn't say it was overkill; I said it was wrong. It is not the purpose of _utf8_off to encode to UTF-8, and it will only do so for some inputs.
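
        A small sketch of why it only works for some inputs, assuming nothing beyond core Perl and Encode. The helper hex_of is made up for this example. The same character is forced into both of Perl's internal storage formats, and _utf8_off produces UTF-8 bytes for only one of them, while encode gives the right bytes regardless.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: show each element of a string as two hex digits.
sub hex_of { join ' ', map { sprintf('%02X', ord($_)) } split //, $_[0] }

my $char = "\x{e9}";    # one character: e-acute (U+00E9)

# Correct: encode always yields the UTF-8 bytes for the character.
printf "%s\n", hex_of(encode('UTF-8', $char));    # C3 A9

# The same character, forced into each internal storage format.
my $downgraded = $char; utf8::downgrade($downgraded);    # one byte, flag off
my $upgraded   = $char; utf8::upgrade($upgraded);        # UTF-8 inside, flag on

# Clearing the flag exposes whatever bytes happen to be stored.
Encode::_utf8_off($_) for $downgraded, $upgraded;

printf "%s\n", hex_of($downgraded);    # E9     (not UTF-8 for this character)
printf "%s\n", hex_of($upgraded);      # C3 A9  (matches only because of the storage)
```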
Re: can not use utf8 with File::Slurp
by repellent (Priest) on Feb 19, 2011 at 00:14 UTC
    You can pass a filehandle to write_file:
    use File::Slurp qw(write_file);
    my $ustr = "simple unicode string \x{0434} indeed";
    {
        open(my $FH, ">:encoding(UTF-8)", "a.txt")
            or die "Failed to open file - $!";
        write_file($FH, $ustr)
            or warn "Failed write_file";
    }
      I've used File::Slurp::write_file only to have shorter code.

      If I have an open filehandle, then just using "print" would be even shorter :)

      Thanks for another bit of knowledge, however!
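
      For completeness, a sketch of the plain print version with no File::Slurp at all. It reuses the a.txt name and the string from the question, then reads the file back through the same layer as a sanity check.

```perl
use strict;
use warnings;

my $file = 'a.txt';
my $ustr = "simple unicode string \x{0434} indeed";

# Write: the layer in the open mode encodes characters to UTF-8 bytes.
open(my $out, '>:encoding(UTF-8)', $file) or die "Failed to open '$file': $!";
print {$out} $ustr;
close($out) or die "Failed to close '$file': $!";

# Read back through the same layer to check the round trip.
open(my $in, '<:encoding(UTF-8)', $file) or die "Failed to open '$file': $!";
my $back = do { local $/; <$in> };
close($in);

print $back eq $ustr ? "round trip ok\n" : "mismatch\n";
```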

Re: can not use utf8 with File::Slurp
by vkon (Curate) on Feb 15, 2011 at 16:46 UTC
    Thanks all for the answers,
    they shed some light on the problem :)