OK. Colour me surprised.

When I wrote that I would be surprised if stat or rename had problems with files that contain odd characters, I was actually thinking of characters in the ASCII character set, not Unicode. However, I am surprised and disappointed that Perl cannot transparently handle Unicode in filenames.

Perl has for many years transparently handled Unicode in string variables. There are of course many pitfalls in constructing those strings from data external to the script, but in this case that should not be the programmer's problem. Perl's readdir should just make the appropriate Windows system calls to get the Unicode filename, and store that filename, complete with any Unicode, in an internal string.
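As a minimal sketch of what "transparent" means for string variables, and of where the pitfalls with external data sit: the filename below is a made-up example, and the snippet only shows string handling with the core Encode module. It says nothing about what readdir itself returns on Windows.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Bytes as they might arrive from outside the script
# (hypothetical example: the UTF-8 encoding of "naïve.txt").
my $bytes = "na\xC3\xAFve.txt";

# Decode once at the boundary; from here on the string holds characters.
my $name = decode('UTF-8', $bytes);

my $byte_count = length $bytes;   # counts bytes in the raw data
my $char_count = length $name;    # counts characters in the decoded string

# Encode again only when the data leaves the script.
my $round_trip = encode('UTF-8', $name);

binmode STDOUT, ':encoding(UTF-8)';
print "$name: $byte_count bytes, $char_count characters\n";
```

The point of the sketch is that the decode and encode steps sit only at the edges of the script. Everything in between works on characters without special care, which is the behaviour I would have expected readdir to give for filenames.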

The programmer should then be able to read and write files with those names without worrying whether they contain Unicode or not. Obviously, if the programmer is transforming filenames then they need to be careful, but in many cases that is not an issue. It is far more common to open and read a file than it is to rename one.

I think that it is a mistake in 2011 for Perl to deliberately use the old Win9x API to get an ASCII filename for backwards compatibility reasons, when the last Win9x OS was retired many years ago.


In reply to Re^3: Handling windoze filenames with odd charactters by chrestomanci
in thread Handling windoze filenames with odd charactters by cormanaz
