Ideally Perl should use the wide-character interfaces internally by default, returning a UTF-8 string when needed and a Latin-1 string otherwise. The problem, as I recall, is that the interface for these routines is char * with no flag to request Unicode filesystem semantics.
The problem here is that the default behaviour is based on the kludge of UTF-8 as employed by the *nix world: you are expected to know that your filenames are UTF-8 based on your locale. That is a fundamentally broken idea, but it is backwards compatible with code that is not Unicode-aware.
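To make the *nix convention concrete, here is a minimal sketch of what a script ends up doing by hand. It assumes the locale encoding is UTF-8, and decode_filename is a hypothetical helper name, not anything Perl provides. readdir hands back raw bytes, so the script has to guess the encoding and fall back to Latin-1 when the bytes are not valid UTF-8:

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK LEAVE_SRC);

# Hypothetical helper: treat raw filename bytes as UTF-8 (an assumption
# about the locale) and fall back to Latin-1 if they are not valid UTF-8.
sub decode_filename {
    my ($bytes) = @_;
    # FB_CROAK makes decode die on malformed input; LEAVE_SRC keeps
    # $bytes intact so the fallback below can still use it.
    my $chars = eval { decode('UTF-8', $bytes, FB_CROAK | LEAVE_SRC) };
    return defined $chars ? $chars : decode('ISO-8859-1', $bytes);
}

# readdir returns byte strings; nothing marks which encoding they use.
opendir(my $dh, '.') or die "Cannot open directory: $!";
my @names = map { decode_filename($_) } readdir($dh);
closedir($dh);
```

The guess can be wrong: a Latin-1 filename whose bytes happen to form valid UTF-8 is silently misread, which is exactly the weakness of relying on the locale.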
Oh, and before any *nix zealot decides to lecture me on how much smarter the *nix solution is, please go and read the history of the creation of UTF-8. It was specifically designed as a workaround that let legacy computer systems (UNIX specifically) handle Unicode, and was always intended to be replaced by better mechanisms at a later date. But workarounds have a nasty habit of lasting much, much longer than most people realize.
In reply to Re: unicode version of readdir by demerphq, in thread unicode version of readdir by dk