Please read perlunitut and perluniadvice.
Filenames live outside Perl, so if you treat them as text you need to decode and encode them explicitly every time (and then hope the encoded bytes match exactly how the name is stored). In other words: for filenames, use byte strings, not Unicode text strings.
A filename that you get from readdir or glob is already a properly encoded byte string. You can use it to open a file, without decoding or encoding the string.
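As a minimal sketch of that point (the scratch directory and the file name are made up for the demonstration, and the name is assumed to be stored as UTF-8 bytes), the entry returned by readdir is passed straight to open:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical setup: a scratch directory holding one file whose name
# contains non-ASCII bytes (here, the UTF-8 encoding of "Presentación.ppt").
my $dir  = tempdir(CLEANUP => 1);
my $name = "Presentaci\xC3\xB3n.ppt";    # a byte string, not decoded text
open my $out, '>', "$dir/$name" or die "create failed: $!";
print {$out} "demo\n";
close $out;

# Read the directory and keep the entries exactly as the system returns them.
opendir my $dh, $dir or die "opendir failed: $!";
my @entries = grep { $_ ne '.' && $_ ne '..' } readdir $dh;
closedir $dh;

# Open each entry with its bytes untouched: no decode, no encode.
my @contents;
my @flagged;
for my $entry (@entries) {
    push @flagged, utf8::is_utf8($entry) ? 1 : 0;    # byte strings are not flagged
    open my $in, '<', "$dir/$entry" or die "open failed: $!";
    push @contents, scalar <$in>;
    close $in;
}
```

Some filesystems normalize names when they store them, so the bytes that come back are not guaranteed to be identical to the bytes used to create the file. Using what readdir returns, unchanged, sidesteps that.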
And the question is: what kind of character encoding will the directory entry "Presentación.ppt" be in, and on what does it depend?
The character encoding depends on the filesystem and on any encoding layers that the filesystem's implementation uses. In any case, you cannot be sure in a platform-independent way. (Yes, that sucks.)
Perl reads the entry with all its bytes correctly, but $entry does not have the "is_utf8" flag set.
That would be wrong. A filename is a binary string, not a text string. It consists of bytes, not characters. In order to use it as a text string, you have to decode it first. But it can be very hard to find out HOW to decode it, and certainly perl can't figure it out for you.
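If you do need the name as text (for display, say), a hedged sketch of the decode step looks like this. The encoding is an assumption you have to supply yourself; UTF-8 is used here only as an example, and the sample names are hypothetical. Keep the original bytes around for any further file operations.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# ASSUMPTION: names on this filesystem are stored as UTF-8.
# Perl cannot detect this for you; pick the value for your platform.
my $assumed_encoding = 'UTF-8';

# Hypothetical byte string, as readdir might return it.
my $bytes = "Presentaci\xC3\xB3n.ppt";

# Decode for text use. FB_CROAK makes malformed input die instead of being
# silently replaced; LEAVE_SRC keeps $bytes unmodified for later use.
my $text = eval {
    decode($assumed_encoding, $bytes, Encode::FB_CROAK | Encode::LEAVE_SRC);
};
die "name is not valid $assumed_encoding\n" unless defined $text;

# Encoding the text again gives the byte string back in this case.
my $round_trip = encode($assumed_encoding, $text);

# The same name stored as Latin-1 fails the UTF-8 decode: the guess was wrong.
my $latin1_bytes = "Presentaci\xF3n.ppt";
my $wrong = eval {
    decode($assumed_encoding, $latin1_bytes, Encode::FB_CROAK | Encode::LEAVE_SRC);
};
```

The decoded $text is what you would print or compare as characters; $bytes is still what you pass to open.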
it returns an error.
Which error?
In reply to Re: utf8 in directory and filenames
by Juerd
in thread utf8 in directory and filenames
by soliplaya