I raised what I suspect are related problems a few months back in "Setting UTF-8 mode on filehandle reads?". The best responses in that thread are the two "Re: Setting UTF-8 mode on filehandle reads?" replies, from diotalevi and grantm respectively. (May their tribes increase.)

I have Perl 5.8 available now (hurray for getting off Windows!), so I wrote a little test script to see what happens when you print the same data with and without an encoding layer on the output filehandle:

    #!/usr/bin/perl -w

    use utf8;

    # without the encoding layer
    open (my $fh1, ">", "test-normal");

    # with the encoding layer
    open (my $fh2, ">:utf8", "test-utf8");

    # ASCII data
    # encodes the same in UTF-8 or Latin-1 encodings
    my $ascdata = "aei\n";
    print $fh1 $ascdata;
    print $fh2 $ascdata;

    # accented a e i
    my $l1data = "\xe1\xe9\xed\n";
    # these characters *can* be encoded in Latin-1 or in UTF-8
    # (though differently for each)
    print $fh1 $l1data;
    print $fh2 $l1data;

    # U+0641 ARABIC LETTER FEH
    my $u8data = "\x{0641}\n";
    # "Arabic-Feh" can't be encoded in Latin-1, can be encoded in UTF-8
    print $fh1 $u8data; # <--THIS LINE GENERATES WARNING
    print $fh2 $u8data;

Here are the results, when checked with od:

    [jeremy@serpent pm-test]$ perl wide-char.pl
    Wide character in print at wide-char.pl line 27.
    [jeremy@serpent pm-test]$ od -t x1 test-normal
    0000000 61 65 69 0a e1 e9 ed 0a d9 81 0a
    0000013
    [jeremy@serpent pm-test]$ od -t x1 test-utf8
    0000000 61 65 69 0a c3 a1 c3 a9 c3 ad 0a d9 81 0a
    0000016
    [jeremy@serpent pm-test]$

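I recognize this isn't exactly what you were asking, but I suspect that the utf8 pragma and the :utf8 encoding layer are getting mixed up somewhere in your code -- one or the other is missing, for instance. They do different jobs: use utf8 only tells perl that the source file itself is written in UTF-8, while the :utf8 layer controls how characters are encoded as they pass through a filehandle.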

More specifically, it sounds like you're trying to print a character whose ord value is larger than 0xff to a filehandle that has no :utf8 layer, which perl treats as a Latin-1 filehandle. Those characters aren't encodable in Latin-1, so you're running the risk of losing data. This is a real problem, and I encourage you to track it down. Turning off a warning isn't the same as fixing its cause.
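
For what it's worth, here is a minimal sketch of the fix I have in mind: give the output handle an explicit encoding layer instead of silencing the warning. The helper name bytes_written is made up for illustration, and it writes to an in-memory filehandle rather than a real file so the bytes can be inspected directly. I'm using :encoding(UTF-8) here; the :utf8 layer from my test script would produce the same bytes for this data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): print $text through
# the given PerlIO layer into an in-memory file and return the bytes that
# were written, as a space-separated hex string.
sub bytes_written {
    my ($layer, $text) = @_;
    my $buffer = '';
    open(my $fh, ">$layer", \$buffer)
        or die "Cannot open in-memory handle: $!";
    print {$fh} $text;
    close($fh) or die "Cannot close in-memory handle: $!";
    return join ' ', unpack('(H2)*', $buffer);
}

# U+0641 ARABIC LETTER FEH: encodable because the handle declares UTF-8.
print bytes_written(':encoding(UTF-8)', "\x{0641}\n"), "\n";
# prints: d9 81 0a

# The accented a e i from the test script, now encoded as UTF-8 too.
print bytes_written(':encoding(UTF-8)', "\xe1\xe9\xed\n"), "\n";
# prints: c3 a1 c3 a9 c3 ad 0a
```

For a handle that is already open (STDOUT, say), binmode(STDOUT, ':encoding(UTF-8)') should have the same effect.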

It might help if you posted a snippet that exhibits the warning. Warnings are usually there for a reason, and perhaps something in your code is a bit sketchy from perl's point of view.

Hope that helps.

Update: I've just noticed that Perl's coping behavior for characters greater than 0xff on a non-utf8 output filehandle is to print the UTF-8 encoding of that character anyway: note that the last two bytes before the final newline in both output files are d9 81.

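No wonder you get a warning. test-normal now holds Latin-1 bytes (e1 e9 ed) and UTF-8 bytes (d9 81) side by side, and there's no systematic way to recover whether any given piece of the output was originally UTF-8 or not!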
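
To make that concrete, here is a rough sketch that takes the exact bytes od reported for test-normal and tries to read them back under each encoding. It assumes the core Encode module; the variable names are mine. Neither reading recovers what was originally printed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK);

# The bytes od showed for test-normal:
# 61 65 69 0a  e1 e9 ed 0a  d9 81 0a
my $bytes = "aei\n\xe1\xe9\xed\n\xd9\x81\n";

# Reading 1: strict UTF-8. decode() with FB_CROAK dies on malformed input
# and may modify its argument, so it gets a copy.
my $copy    = $bytes;
my $as_utf8 = eval { decode('UTF-8', $copy, FB_CROAK) };
print defined $as_utf8 ? "valid UTF-8\n" : "not valid UTF-8\n";
# prints: not valid UTF-8   (e1 e9 ed is not a legal UTF-8 sequence)

# Reading 2: Latin-1. This never fails, but the two FEH bytes come back
# as two unrelated characters instead of U+0641.
my $as_latin1  = decode('ISO-8859-1', $bytes);
my $third_line = (split /\n/, $as_latin1)[2];
printf "U+%04X\n", ord($_) for split //, $third_line;
# prints: U+00D9
#         U+0081
```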

Update 2: Cleaned up comments by using the word "encoded" instead of "printed".


In reply to Re: how do you turn off the following warning: wide character in print script.pl line 12 by jkahn
in thread how to turn off warning: wide character in print ? by Isanchez
