in reply to UTF-8 and readdir, etc.

The comments here are appallingly ignorant, and sadly the perl implementation on Windows follows suit. NTFS filenames are encoded in UTF-16, and perl *could* handle that correctly, but it doesn't. So you have to use something like Win32::Unicode or, if you're using Cygwin (as I am), call decode_utf8 on every name you read from a directory. Note that File::Find doesn't know about any of this, so it isn't usable on Windows.
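
For the Cygwin case, the decode step might look roughly like the sketch below. It assumes readdir returns UTF-8 encoded bytes (which is what I see under Cygwin); read_dir_decoded is just a name made up for this example, not something from a module.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8 encode_utf8);

# Hypothetical helper: list a directory and return the entries as
# decoded character strings rather than raw bytes.
# Assumption: the platform hands back UTF-8 bytes from readdir.
sub read_dir_decoded {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    my @names = map  { decode_utf8($_) }          # bytes -> characters
                grep { $_ ne '.' && $_ ne '..' }  # skip the dot entries
                readdir($dh);
    closedir($dh);
    return @names;
}

# Encode again on output so the terminal receives UTF-8.
binmode(STDOUT, ':encoding(UTF-8)');
print "$_\n" for read_dir_decoded('.');
```

To open one of these files afterwards, the name has to go back to bytes first, e.g. with encode_utf8($name), since open expects the same byte form that readdir returned.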

Re^2: UTF-8 and readdir, etc.
by Your Mother (Archbishop) on Sep 12, 2019 at 22:52 UTC

    Ignorance, and terrible design, abounds–

    NTFS stores file names in Unicode. –The Horse’s Mouth :(

    –and–

    NTFS allows any sequence of 16-bit values for name encoding (file names, stream names, index names, etc.) except 0x0000. This means (case insensitive) UTF-16 code units are supported, but the file system does not check whether a sequence is valid UTF-16 (it allows any sequence of short values, not restricted to those in the Unicode standard) –Wackypardia