Re^4: problem with hashes, loaded from file

Re^5: problem with hashes, loaded from file by Anonymous Monk on Dec 25, 2014 at 19:35 UTC
Well, if I were a sysadmin, I would be glad to have found some broken file names (rather than not knowing they're there). Anyway, `locale` doesn't do much (I think pre-5.020 LC_CTYPE simply doesn't work with UTF-8 locales), and `encoding` is problematic because it is global, which can break quite a lot of modules. Also, `encoding` is deprecated as of 5.018. I personally use `use utf8; use open qw( :encoding(utf-8) :std ); use Encode;`
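A minimal sketch of what that preamble buys you, assuming the script file itself is saved as UTF-8 (the variable `$word` is just an illustration, and `use Encode;` is left out because nothing here calls it):

```perl
use strict;
use warnings;
use utf8;                               # string literals in this file are UTF-8
use open qw( :encoding(utf-8) :std );   # default I/O layers, also applied to STDIN/STDOUT/STDERR

my $word = "привет";

# With `use utf8` the literal is a character string, so length counts
# characters rather than bytes.
printf "%d characters\n", length $word;   # 6 characters

# Case mapping works on characters; the :std layer encodes the output.
print ucfirst($word), "\n";
```

Without `use utf8`, the same literal would be a 12-byte string and `ucfirst` would leave it unchanged.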
Re^6: problem with hashes, loaded from file by igoryonya (Pilgrim) on Dec 25, 2014 at 20:49 UTC
I started writing programs with Russian text I/O in mind only a few years ago. Before that, I had just been saving my Perl programs' files as UTF-8, thinking that this made my programs Unicode-enabled, as I had read in various places. I had to start including those Unicode and locale settings after I noticed that Russian text was scrambled on output. I am still pretty vague about UTF-8 in Perl, because there is so much I have to and want to learn in Perl other than Unicode. When I look at the Unicode/UTF-8 family of Perl documentation, there is so much to read (I have gone into it more deeply in the last few months, but compared with how much is left to read on the subject, it's almost nothing) that I always prefer reading about other Perl subjects. I read more about Unicode only when I encounter some problem. I have Perl 5.14 installed on my computer. I recently installed new servers, ran my programs on them, and noticed warnings about `encoding` being deprecated. The servers have Perl 5.18 installed. I was thinking about updating my code, since those Unicode settings were added to all of my libraries. I haven't done it yet because of other deadlines that I have to meet. I will try your suggestion and see what it does for me.
Re^7: problem with hashes, loaded from file by Anonymous Monk on Dec 25, 2014 at 21:10 UTC
Yes, it is very unfortunate. Perl's Unicode capabilities are some of the best among all programming languages (I think only ICU is comparable?), but its string handling is very confusing, and the documentation is huge and all over the place. There have been discussions about that already... Basically, the best way is to decode all input and encode all output. The main tools are `use utf8`, `use open ...` (the pragma), `open` (the function), `binmode`, and `Encode`. Of course, if some filenames are not valid UTF-8, they can't be decoded as such.
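A rough sketch of "decode all input, encode all output" with explicit `Encode` calls. The helper `decode_or_undef` and the two byte strings are made up for illustration; they stand in for whatever raw bytes you get back from something like `readdir`:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: turn raw bytes into a character string,
# or return undef if the bytes are not valid UTF-8.
sub decode_or_undef {
    my ($bytes) = @_;
    my $copy = $bytes;   # decode() with a CHECK value may modify its argument
    return eval { decode('UTF-8', $copy, Encode::FB_CROAK) };
}

my $good = "\xD0\xBF\xD1\x80\xD0\xB8";   # UTF-8 bytes for three Cyrillic letters
my $bad  = "\xFF\xFE";                   # not valid UTF-8

# Input side: decode bytes into characters before working with them.
my $chars = decode_or_undef($good);
print length($good), " bytes, ", length($chars), " characters\n";

# A broken name cannot be decoded, so it is reported instead of being used.
print "second name is not valid UTF-8\n" unless defined decode_or_undef($bad);

# Output side: either encode by hand ...
my $out_bytes = encode('UTF-8', $chars);

# ... or let an I/O layer do it for everything printed to the handle.
binmode(STDOUT, ':encoding(UTF-8)');
print $chars, "\n";
```

Keeping the undecodable names as raw bytes in a separate list is one way to report them without losing them.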