I would expect this to be totally revised when Perl 6 comes out; otherwise I feel some real worry here.

I'm not quite sure what you're getting at here. You will always need to tell Perl that you want it to use UTF-8 encoding when you read a specific file. Sure, in the future some of the region-specific encodings such as Latin-1 might lose popularity to Unicode. But if Perl assumed every file was a UTF-8 character stream, it would no longer be able to read binary byte streams (or even UTF-16 encoded text).
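To make the bytes-versus-characters distinction concrete, here is a minimal sketch (assuming Perl 5.8 or later; the temporary file and its contents are made up for illustration). It reads the same file twice, once through an encoding layer and once as raw bytes:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file holding the UTF-8 bytes for "café" plus a newline.
# (Hypothetical sample data; the e-acute is the two bytes C3 A9.)
my ($out, $path) = tempfile(UNLINK => 1);
binmode($out, ':raw');
print {$out} "caf\xC3\xA9\n";
close($out) or die "Cannot close $path: $!";

# Read it as a character stream: the layer decodes UTF-8 for us.
open(my $text, '<:encoding(UTF-8)', $path) or die "Cannot open $path: $!";
my $line = <$text>;
close($text);
chomp $line;

# Read it as a byte stream: no decoding at all.
open(my $bin, '<:raw', $path) or die "Cannot open $path: $!";
my $bytes = <$bin>;
close($bin);
chomp $bytes;

printf "characters: %d, bytes: %d\n", length($line), length($bytes);
```

The same file gives a length of 4 when decoded and 5 when read raw, which is why Perl cannot safely guess on your behalf.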

The XML spec provides a way for a program to unambiguously determine the encoding of an XML document. In the absence of this type of in-band information in other text file formats, you will need to specify an encoding.
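As a rough illustration of what "in-band" means here, the sketch below looks at the first bytes of a document for a byte order mark or an encoding declaration. The function name sniff_xml_encoding is made up for this post, and the logic is a deliberate simplification of the autodetection described in the XML spec (for instance, it ignores UTF-32 and EBCDIC). A real program should leave this to its XML parser.

```perl
use strict;
use warnings;

# Hypothetical helper: guess the encoding from the leading raw bytes
# of an XML document. Simplified; not a full implementation of the spec.
sub sniff_xml_encoding {
    my ($head) = @_;

    # A UTF-16 byte order mark settles it straight away.
    return 'UTF-16BE' if $head =~ /\A\xFE\xFF/;
    return 'UTF-16LE' if $head =~ /\A\xFF\xFE/;

    # Skip a UTF-8 byte order mark if present.
    $head =~ s/\A\xEF\xBB\xBF//;

    # Look for encoding="..." inside the XML declaration.
    if ($head =~ /\A<\?xml\s[^>]*?\bencoding\s*=\s*(["'])([A-Za-z][A-Za-z0-9._-]*)\1/) {
        return $2;
    }

    # No byte order mark and no declaration: XML defaults to UTF-8.
    return 'UTF-8';
}

print sniff_xml_encoding(qq{<?xml version="1.0" encoding="ISO-8859-1"?><doc/>}), "\n";
print sniff_xml_encoding('<doc/>'), "\n";
```

A plain text file has nothing comparable to inspect, so the encoding has to come from you.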

As you point out, 5.8 provides the very powerful IO layer model for dealing with this and other problems. I don't expect IO layers to disappear in 6.0. And for people stuck with 5.6, pack hacks do provide a workaround.

What is expected to change in the future is that Perl will assume your script itself is UTF-8 encoded. Assuming you use a UTF-8 aware editor, that will allow you to include non-ASCII characters in string literals simply by typing them. At the moment, if you want to do that, you have to say 'use utf8'. In the future that will be assumed (and to quote the docs, "'use utf8' will become a noop").
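A small sketch of what 'use utf8' changes today, assuming the script itself is saved as UTF-8 (the variable names are just for illustration). The same literal is compiled once with the pragma in effect and once with it switched off:

```perl
use strict;
use warnings;
use utf8;    # this source file is UTF-8 encoded

# With the pragma on, the literal is four characters.
my $chars = "café";

# With it off in this block, the same literal is five raw bytes.
my $bytes = do { no utf8; "café" };

printf "with use utf8: %d, without: %d\n", length($chars), length($bytes);
```

Note that the pragma only affects how the source code is read. Output still needs its own layer (for example binmode on STDOUT) if you print the character string.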


In reply to Re: Re: Setting UTF-8 mode on filehandle reads? by grantm
in thread Setting UTF-8 mode on filehandle reads? by jkahn
