in reply to hard-coded hash return nothing for utf-8 string

*The perl version is 5.8.0 (unix) and I can't upgrade it; storing the perl file in utf-8 and calling "use utf8" does not sound good, as I do the programming on Windows and the utf-8 signature kills the perl interpreter.
Regardless of whether this is really a perl bug or something in your code, for serious unicode work you should upgrade both perls to at least 5.8.8. Various utf8 bugs have been fixed and semantics have changed since 5.8.0. perl581delta says, for instance:
For example, if you had "en_US.UTF-8" as your locale, your STDIN and STDOUT were automatically "UTF-8", in other words an implicit binmode(..., ":utf8") was made. This meant that trying to print, say, chr(0xff), ended up printing the bytes 0xc3 0xbf. Hardly what you had in mind unless you were aware of this feature of Perl 5.8.0. The problem is that the vast majority of people weren’t: for example in RedHat releases 8 and 9 the default locale setting is UTF-8, so all RedHat users got UTF-8 filehandles, whether they wanted it or not. The pain was intensified by the Unicode implementation of Perl 5.8.0 (still) having nasty bugs, especially related to the use of s/// and tr///. (Bugs that have been fixed in 5.8.1)

Therefore a decision was made to backtrack the feature and change it from implicit silent default to explicit conscious option. The new Perl command line option "-C" and its counterpart environment variable PERL_UNICODE can now be used to control how Perl and Unicode interact at interfaces like I/O and for example the command line arguments. See "-C" in perlrun and "PERL_UNICODE" in perlrun for more information.
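
To make the effect described in that excerpt concrete, here is a minimal sketch that writes chr(0xff) to an in-memory filehandle, once with no layer and once with an explicit ":utf8" layer, and reports the bytes that were written. The helper name bytes_written is made up for this illustration, and the sketch assumes a perl built with PerlIO (the default since 5.8.0). It only imitates the implicit layer by applying it explicitly; it does not reproduce the locale handling itself.

```perl
use strict;
use warnings;

# Hypothetical helper: write chr(0xff) to an in-memory filehandle,
# optionally with an I/O layer applied, and return the bytes that
# ended up in the buffer as a space-separated hex string.
sub bytes_written {
    my ($layer) = @_;
    my $buf = '';
    open(my $fh, '>', \$buf) or die "cannot open in-memory handle: $!";
    binmode($fh, $layer) if defined $layer;   # explicit layer, e.g. ':utf8'
    print {$fh} chr(0xff);
    close($fh) or die "cannot close in-memory handle: $!";
    return join ' ', map { sprintf '%02x', ord($_) } split //, $buf;
}

print "no layer: ", bytes_written(), "\n";
print ":utf8:    ", bytes_written(':utf8'), "\n";
```

Without a layer the buffer should hold the single byte ff; with ":utf8" it should hold c3 bf, the same two bytes the excerpt mentions. On 5.8.0 under a UTF-8 locale, STDOUT behaved as if the second call had been made for you.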

Yes, I noticed that you said you can't upgrade it. Ask again.

Re^2: hard-coded hash return nothing for utf-8 string
by vic (Initiate) on Feb 22, 2008 at 05:05 UTC

    Thank you for your detailed explanation.

    I have now double-checked the perl version, and it is 5.8.5. I will try to reproduce the bug later.