in reply to Re^3: problem with hashes, loaded from file
in thread problem with hashes, loaded from file

Unfortunately, I can't do that. I tried converting everything to Unicode (UTF-8) before, but it mangled filenames that had invalid UTF-8 encoding, so I have had to treat filenames as byte sequences, not character sequences.

Re^5: problem with hashes, loaded from file
by Anonymous Monk on Dec 25, 2014 at 19:35 UTC
    Well, if I were a sysadmin I would be glad to have found some broken file names (rather than not knowing they're there). Anyway, the locale pragma doesn't do much (I think that before 5.20, LC_CTYPE simply doesn't work with UTF-8 locales), and the encoding pragma is problematic because it is global, which can break quite a lot of modules. Also, encoding is deprecated as of 5.18. I personally use
    use utf8; use open qw( :encoding(utf-8) :std ); use Encode;
      I only started writing programs with Russian text I/O in mind a few years ago. Before that, I just saved my Perl source files in UTF-8, thinking that this made my programs Unicode-enabled, as I had read in various places. I had to start including those Unicode and locale settings after I noticed that Russian text was scrambled on output. I am still pretty vague about UTF-8 in Perl: there is so much in Perl other than Unicode that I have to and want to learn, and whenever I look at the Unicode/UTF-8 family of Perl documentation there is so much to read that I always prefer reading about other subjects. (I have read it more deeply in the last few months, but compared with how much is left, that is almost nothing.) I read more about Unicode when I run into a problem. I have Perl 5.14 installed on my computer. I recently set up new servers, ran my programs on them, and noticed warnings about the deprecated encoding pragma. The servers have Perl 5.18 installed. I was thinking about updating my code, since those Unicode settings were added to all of my libraries, but I haven't done it yet because of other deadlines. I will try your suggestion and see what it does for me.
        Yes, it is very unfortunate. Perl's Unicode capabilities are some of the best among all programming languages (I think only ICU is comparable?), but its string handling is very confusing, and the documentation is huge and all over the place. There have been discussions about that already... Basically, the best approach is to decode all input and encode all output. The main tools are: use utf8, use open ... (the pragma), open (the function), binmode, and Encode. Of course, if some filenames are not valid UTF-8, they can't be decoded as such.
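
For what it's worth, here is a rough sketch of the "decode input, encode output" idea applied to filenames. It assumes the names arrive as raw bytes (as readdir would return them) and that the intended encoding is UTF-8. The helper decode_filename and the two sample names are made up for illustration and are not from the code discussed in this thread. Names that fail strict decoding are kept as bytes and reported rather than mangled.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: try to decode a raw byte string as strict UTF-8.
# Returns the decoded character string, or undef if the bytes are invalid.
sub decode_filename {
    my ($bytes) = @_;
    my $copy = $bytes;    # decode() with a CHECK argument may modify its input
    my $chars = eval { decode('UTF-8', $copy, Encode::FB_CROAK) };
    return $chars;        # undef when decode() died on malformed input
}

# Sample data (assumption): one valid UTF-8 name, one with a stray 0xFF byte.
my @raw_names = ("caf\xC3\xA9.txt", "bad\xFFname.txt");

for my $raw (@raw_names) {
    my $name = decode_filename($raw);
    if (defined $name) {
        # Valid: work with characters internally, encode again on output.
        my $line = sprintf "ok: %s (%d characters, %d bytes)\n",
            $name, length($name), length($raw);
        print encode('UTF-8', $line);
    }
    else {
        # Invalid: keep the original bytes and show them escaped.
        (my $shown = $raw) =~ s/([^\x20-\x7E])/sprintf('\\x%02X', ord $1)/ge;
        print "not valid UTF-8, keeping bytes: $shown\n";
    }
}
```

The eval plus FB_CROAK combination is only one way to detect malformed input; the point is that a failed decode is treated as a signal to fall back to the byte string, so hash keys built from filenames stay consistent with what is on disk.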