in reply to Re: directories and charsets
in thread directories and charsets
That is exactly the kind of problem I was talking about in my first convoluted message.
From the documentation (Perl Unicode, etc.) and from my own tests, it seems that readdir() always returns "byte" strings, that is, strings not internally marked as "utf8" by Perl, at least according to Encode::is_utf8($dir_entry).
However, it seems that at some point, after concatenating that directory entry with, for instance, the directory path whence it comes, a test such as "if (-f $fullpath)" returns false.
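A minimal sketch of that kind of test, for anyone who wants to reproduce it. The helper name check_entries and the choice of '.' as the directory are hypothetical; this is an illustration, not the exact script used here.

```perl
use strict;
use warnings;
use Encode ();
use File::Spec;

# Hypothetical helper: list a directory and report, for each entry,
# whether Perl has flagged the string as utf8 and whether -f succeeds
# on the path built from the directory and the entry.
sub check_entries {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    my @report;
    for my $entry (readdir $dh) {
        next if $entry eq '.' or $entry eq '..';
        my $fullpath = File::Spec->catfile($dir, $entry);
        push @report, {
            name    => $entry,
            is_utf8 => Encode::is_utf8($entry) ? 1 : 0,
            is_file => (-f $fullpath) ? 1 : 0,
        };
    }
    closedir $dh;
    return @report;
}

# Print one line per entry of the current directory.
printf "%s utf8=%d file=%d\n", @{$_}{qw(name is_utf8 is_file)}
    for check_entries('.');
```

One possible cause worth checking: if the directory path is itself a utf8-flagged string containing non-ASCII characters, concatenation upgrades the byte-string entry as if it were Latin-1, so the name that -f ends up testing may no longer match the bytes on disk.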
I am now testing on a Windows machine, and I thought that Windows NTFS stored filenames as UTF-8. But you seem to say that this is not true, and that it uses UCS-2 instead. That might explain why I get errors when trying various permutations of encoding and decoding my filenames.
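To illustrate why guessing the wrong encoding would matter, here is a small sketch showing how differently the same name looks as bytes under the two encodings. The file name is hypothetical, and UTF-16LE is used here as a stand-in for UCS-2 (the two agree for characters in the Basic Multilingual Plane); it says nothing about what any particular API actually hands back.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical file name containing one non-ASCII character (U+00E9).
my $name = "caf\x{e9}.txt";

# Dump the byte sequence the name becomes under each encoding.
for my $encoding ('UTF-8', 'UTF-16LE') {
    my $bytes = encode($encoding, $name);
    printf "%-8s %2d bytes: %s\n", $encoding, length($bytes),
        join ' ', map { sprintf '%02x', ord } split //, $bytes;
}
```

Decoding bytes of one form as if they were the other will either fail or produce a different string, which would be consistent with the errors described above.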
Back to testing, then, with this exciting new possibility.
Re^3: directories and charsets
by jbert (Priest) on Mar 15, 2007 at 16:46 UTC