in reply to File::Find returning utf-8 characters

At least I think it's UTF-8.

It appears to be. UTF-8 is identical to ASCII for the 128 code points of the ASCII character set (0 through 127), so the ASCII characters in your output still line up byte for byte. Only characters above 127 turn into multi-byte sequences.
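
If you want to convince yourself of the ASCII overlap, here is a small sketch. It is in Python rather than Perl purely for brevity, and the sample strings are my own, not taken from your output:

```python
# Pure-ASCII text produces exactly the same bytes under ASCII and UTF-8.
ascii_text = "File::Find"
same_bytes = ascii_text.encode("ascii") == ascii_text.encode("utf-8")

# Every code point 0..127 encodes to a single byte with the same value.
single_byte = all(chr(cp).encode("utf-8") == bytes([cp]) for cp in range(128))

# A character above 127 (here U+00E9, e with acute accent) becomes a
# multi-byte sequence in which every byte has the high bit set.
e_acute = "\u00e9".encode("utf-8")
high_bit_set = all(b >= 0x80 for b in e_acute)
```

So a filename made only of ASCII characters looks the same either way. The difference only shows up once accented or other non-ASCII characters appear.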

aside -- The niceness of UTF-8 is why Linux C apps don't have to worry about Unicode much. UTF-8 is just another bytestream to them, and they don't need to know what is Unicode and what isn't. Dealing with other encodings is more painful, and Microsoft C/C++ likes to work in 16-bit wide characters (WCHAR, holding UTF-16). I experienced this fun recently when dealing with a Unicode-enabled shared library we were writing. UTF-8 is much better than the alternatives.
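
A rough illustration of the "just another bytestream" point, again sketched in Python with a made-up filename: UTF-8 never produces a zero byte for a non-NUL character, so NUL-terminated C strings survive intact, while UTF-16 puts a zero byte next to every ASCII character.

```python
# Hypothetical filename containing two non-ASCII characters (U+00E9).
name = "r\u00e9sum\u00e9.txt"

utf8 = name.encode("utf-8")        # what a byte-oriented C program would see
utf16 = name.encode("utf-16-le")   # 16-bit code units, as used for WCHAR strings

# A zero byte would end a char* string early in C.
has_nul_utf8 = 0 in utf8
has_nul_utf16 = 0 in utf16

utf8_len = len(utf8)    # 8 ASCII bytes + 2 bytes for each accented character
utf16_len = len(utf16)  # 10 characters * 2 bytes each
```

That is why code built around char* and strlen can pass UTF-8 through untouched, but needs separate wide-character handling for UTF-16.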
