
Re: UTF-8 and readdir, etc.

by andal (Hermit)
on Feb 01, 2018 at 08:25 UTC ( #1208231=note )


in reply to UTF-8 and readdir, etc.

As kcott has already pointed out, you don't need any special handling if you only read directory contents and then write them to a file. I just want to add a little explanation of why.

Essentially, there are "bytes" and "characters". With "bytes", each element is just a number from 0 to 255. With "characters", each element is a code point (Unicode defines the range 0 to 0x10FFFF, although perl itself accepts larger values) that corresponds to some symbol which can be drawn on screen or paper. "Encoding", "unicode", "locale" and related machinery describe how to convert "bytes" into "characters" and back. A "string" can be either a sequence of "bytes" or a sequence of "characters".
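To make the distinction concrete, here is a minimal sketch using the core Encode module. The sample value (the letter "é" encoded as UTF-8) and the variable names are my own, chosen only for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# The letter "e with acute" as the two UTF-8 bytes 0xC3 0xA9,
# i.e. the form in which the OS would hand it to us.
my $bytes = "\xC3\xA9";

# Decoding turns the sequence of bytes into a sequence of characters.
my $chars = decode('UTF-8', $bytes);

my $byte_len   = length $bytes;   # 2 elements, each in 0..255
my $char_len   = length $chars;   # 1 element
my $code_point = ord $chars;      # 0xE9, the code point of that character

# Encoding goes the other way and gives back the original bytes.
my $round_trip = encode('UTF-8', $chars);

printf "%d bytes, %d character (U+%04X)\n", $byte_len, $char_len, $code_point;
```

The same data has length 2 or length 1 depending on whether perl sees it as "bytes" or as "characters", which is why it matters to know which one you are holding.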

All exchange between a perl program and the OS happens in "bytes". Inside perl one can work either with "bytes" or with "characters", but the results will differ. For example, a regular expression that is supposed to match unicode characters will fail when applied to "bytes", but will work when applied to "characters". So, if you only get data from the OS and then immediately hand it back to the OS, you don't have to bother converting from "bytes" to "characters"; it would just be a waste of time.

On the other hand, if your code has "use utf8", then the string literals in your source are treated by perl as "characters". So, if you decide to pass such a literal to the OS, you must first convert it from "characters" to "bytes". That is why some people were describing such a procedure for opendir here.
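Roughly, the whole round trip looks like the sketch below. It works in a scratch directory from File::Temp, and it assumes a filesystem that stores file names exactly as the bytes it was given (typical on Linux; some systems normalise names). The file name and the pattern are made up for the example, and the name is written with escapes so the sketch does not depend on the encoding of the source file; under "use utf8" a literal would give the same character string.

```perl
use strict;
use warnings;
use Encode qw(decode encode);
use File::Temp qw(tempdir);

# Hypothetical scratch directory, removed automatically at exit.
my $dir = tempdir(CLEANUP => 1);

# A character string: 8 characters, two of them outside ASCII.
my $name_chars = "\x{17E}lu\x{165}.txt";

# Characters -> bytes before the name is passed to the OS (10 bytes).
my $name_bytes = encode('UTF-8', $name_chars);
open my $out, '>', "$dir/$name_bytes" or die "open: $!";
close $out or die "close: $!";

# readdir returns bytes.
opendir my $dh, $dir or die "opendir: $!";
my @entries = grep { $_ ne '.' && $_ ne '..' } readdir $dh;
closedir $dh;

# Pass-through: bytes in, bytes out, no conversion at all.
my $listing = '';
open my $list, '>', \$listing or die "open: $!";
print {$list} "$_\n" for @entries;
close $list or die "close: $!";

# Character semantics are only available after decoding.
my $pattern     = qr/\A.{4}\.txt\z/;   # exactly four characters, then ".txt"
my $bytes_match = $entries[0] =~ $pattern ? 1 : 0;
my $chars_match = decode('UTF-8', $entries[0]) =~ $pattern ? 1 : 0;

print "bytes match: $bytes_match, characters match: $chars_match\n";
```

The listing is written without any decoding, and only the regular expression step needs "characters". Applied to the raw bytes, the pattern sees six elements before ".txt" and does not match.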

There are different ways to do the conversion. One can use the Encode module directly, or pass an ":encoding(...)" layer to the open function, or use some other way. But one has to clearly understand why the conversion is being done, and whether it is needed at all.
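As a small comparison of the first two options, the sketch below writes the same character string once through an explicit Encode call and once through an ":encoding(UTF-8)" layer. In-memory filehandles are used only to keep the example self-contained; the variable names are mine.

```perl
use strict;
use warnings;
use Encode qw(encode);

my $text = "\x{17E}lu\x{165}\n";   # a string of characters

# Way 1: call Encode directly and print the resulting bytes.
my $explicit = '';
open my $raw, '>', \$explicit or die "open: $!";
print {$raw} encode('UTF-8', $text);
close $raw or die "close: $!";

# Way 2: let an :encoding layer do the conversion on every print.
my $layered = '';
open my $enc, '>:encoding(UTF-8)', \$layered or die "open: $!";
print {$enc} $text;
close $enc or die "close: $!";

# Both buffers now hold the same UTF-8 bytes.
my $same = $explicit eq $layered ? 1 : 0;
print "same bytes: $same\n";
```

Which one to pick is mostly a matter of where you want the conversion to live: at the filehandle, so that everything printed to it is converted, or at individual call sites.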
