As kcott has already pointed out, you don't need any special handling if you only read directory contents and then write them to a file. I just want to add a little explanation.

Essentially, there are "bytes" and "characters". A "byte" is just a number from 0 to 255. A "character" is a code point (in Unicode the defined range is 0 to 0x10FFFF) and usually corresponds to some symbol which can be drawn on screen or paper. Encodings, Unicode, locales and the like describe how to convert "bytes" into "characters" and back. A "string" can be either a sequence of "bytes" or a sequence of "characters".
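To make the distinction concrete, here is a small sketch using the core Encode module. The choice of UTF-8 as the encoding is my assumption for the example; the same idea applies to any other encoding.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# In UTF-8 the character U+00E9 is stored as the two bytes 0xC3 0xA9.
my $bytes = "\xC3\xA9";

# decode: bytes -> characters
my $chars = decode('UTF-8', $bytes);

printf "as bytes:      length %d\n", length $bytes;
printf "as characters: length %d, code point U+%04X\n",
    length $chars, ord $chars;

# encode: characters -> bytes, which gives back the original data
my $round_trip = encode('UTF-8', $chars);
print "round trip ok\n" if $round_trip eq $bytes;
```

The same data has length 2 when viewed as bytes and length 1 when viewed as characters, which is why it matters which of the two your code is working with.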

Data exchange between a perl program and the OS happens only in "bytes". Inside perl one can work either with "bytes" or with "characters", but the results will differ. For example, a regular expression that is supposed to match Unicode characters may fail or give wrong results when applied to "bytes", while it works when applied to "characters". So, if you only get data from the OS and then immediately hand it back to the OS, you don't have to bother converting "bytes" to "characters"; it would just be a waste of time. On the other hand, if your code has "use utf8", perl treats the source file as UTF-8, and string literals containing non-ASCII text become "characters". If you decide to pass such a literal to the OS, you must first convert it from "characters" to "bytes". That is why some people here were describing such a procedure for opendir.
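Here is a sketch of both situations. It assumes a Unix-like system whose file names are UTF-8; the file name is hypothetical and the file is created in a temporary directory only so that the example can run on its own.

```perl
use strict;
use warnings;
use utf8;                        # the source is UTF-8, so the literal below is characters
use Encode qw(encode decode);
use File::Temp qw(tempdir);

my $name_chars = "пример.txt";                   # hypothetical name, a character string
my $name_bytes = encode('UTF-8', $name_chars);   # assumption: the file system uses UTF-8

# The literal has to be converted to bytes before it goes to the OS.
my $dir = tempdir(CLEANUP => 1);
open(my $fh, '>', "$dir/$name_bytes") or die "open: $!";
close $fh;

# readdir hands back whatever bytes the OS stored.
opendir(my $dh, $dir) or die "opendir: $!";
my @entries = grep { !/^\.\.?$/ } readdir $dh;
closedir $dh;

# Decode only when the names are going to be processed as text.
my @decoded = map { decode('UTF-8', $_) } @entries;

printf "bytes: %d, characters: %d\n", length $entries[0], length $decoded[0];

# \w knows about Cyrillic letters only when it sees characters.
my $re = qr/^\w+\.txt$/;
print "regex on bytes:      ", ($entries[0] =~ $re ? "match" : "no match"), "\n";
print "regex on characters: ", ($decoded[0] =~ $re ? "match" : "no match"), "\n";
```

If the entries were only being written out to a file, the decode step could be dropped entirely and the bytes passed through untouched.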

There are different ways to do the conversion. One can use the Encode module directly, pass an ":encoding(...)" layer to the open function, or use some other way. But one has to clearly understand why the conversion is being done, and whether it is needed at all.
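A minimal sketch of the two ways mentioned above, writing through an ":encoding" layer and then reading the raw bytes back and decoding them with Encode. The temporary file and the sample text are just for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);
use File::Temp qw(tempfile);

my $text = "\x{263A} smile";     # a character string; U+263A does not fit in one byte

my ($tmp, $path) = tempfile(UNLINK => 1);
close $tmp;

# Way 1: let the I/O layer convert characters to bytes on output.
open(my $out, '>:encoding(UTF-8)', $path) or die "open: $!";
print {$out} $text;
close $out;

# Way 2: read the raw bytes and convert them explicitly.
open(my $in, '<:raw', $path) or die "open: $!";
my $raw = do { local $/; <$in> };
close $in;

my $decoded = decode('UTF-8', $raw);

printf "%d bytes on disk, %d characters in perl\n", length $raw, length $decoded;
print "same text\n" if $decoded eq $text;
```

Which one to pick is mostly a matter of convenience: the layer is handy when everything going through a handle is text in one encoding, while Encode gives control over individual strings.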


In reply to Re: UTF-8 and readdir, etc. by andal
in thread UTF-8 and readdir, etc. by jrw005
