I was researching how to get the results of readdir in utf8, and didn't find anything useful. The problem is that I want to read file names on win32 that contain non-latin characters, which are mapped to '?' in my codepage.
I found that the problem has been discussed before, but couldn't find any suitable solution: the jperl hacks are discontinued, and Win32API::File has no FindFirst/FindNext entries.
If it is indeed not possible, I was thinking of introducing a switch in the perl core that would toggle readdir between returning bytes and returning utf8. The next step would probably be for open to recognize utf8 file names as well, but that's for later.
Another aspect is that the problem is wider than win32 - it is perfectly legal to create utf8 file names on unix file systems (of course one can always treat them there as non-unicode names, which is not possible on win32); gnome utilities use this feature when run under UTF8 locales. The point is that if someone a) explicitly knows that their files have utf8 names and b) wants to access them with perl utf8 semantics and little hassle (and irrespective of the locale!), there's no way to do that except to mess with Encode.
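To make the "mess with Encode" part concrete, here is a minimal sketch of what I mean for the unix case. The helper name readdir_utf8 is made up for illustration, and it assumes the names on disk really are utf8 bytes; it does nothing for the win32 case, where the names have already been mangled to '?' before perl sees them.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper (not part of core or any module): read a directory
# and decode each entry from UTF-8 bytes into a perl character string.
# Assumption: the file system stores the names as UTF-8.
sub readdir_utf8 {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    my @names;
    for my $bytes (readdir $dh) {
        next if $bytes eq '.' or $bytes eq '..';
        # FB_CROAK makes decode() die on malformed input;
        # LEAVE_SRC keeps $bytes unmodified so we can fall back to it.
        my $name = eval {
            Encode::decode('UTF-8', $bytes,
                Encode::FB_CROAK | Encode::LEAVE_SRC);
        };
        # If the name is not valid UTF-8, keep the raw bytes.
        push @names, defined $name ? $name : $bytes;
    }
    closedir $dh;
    return @names;
}
```

Every caller has to do this by hand (and encode the name back to bytes before passing it to open), which is exactly the hassle I would like a core switch to remove.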
So my questions are:
- Can (as of now) readdir return utf8 scalars?
- If not, would it be a good idea to introduce such a change in core?
- If yes, what would be the most desirable format of the trigger? A new system var, e.g. $UTF8_FILENAMES, or a new pragma like "use utf8 'filenames'" or "use utf8_filenames", or ...?
Thank you!
In reply to unicode version of readdir by dk