in reply to UTF8 hash key downgraded when assigned

Hi gibus,
The code you posted won't even compile for me (on Windows):
    Malformed UTF-8 character: \xe9\x27\x20 (unexpected non-continuation byte 0x27, immediately after start byte 0xe9; need 3 bytes, got 1) at try.pl line 11.
    Malformed UTF-8 character (fatal) at try.pl line 11.
Seems to work fine for me if I rewrite your script as:
    #!/usr/bin/perl
    use strict;
    use warnings;
    use utf8;
    use Devel::Peek;
    use Data::Dumper;
    $Data::Dumper::Useqq = 1;

    my $s;
    {
        no utf8;
        $s = 'clé';
    }
    utf8::upgrade($s);

    my %hash = (
        $s => 0,
    );

    my $key = (keys %hash)[0];
    Dump($key);
    print Dumper($key);

    $hash{$s} = 1;
    $key = (keys %hash)[0];
    Dump($key);
    print Dumper($key);

    utf8::upgrade($key); # does nothing
    Dump($key);
    print Dumper($key);
UPDATE: When I first posted this rewritten version, the utf8::upgrade($s); call was inside the { no utf8; ... } block, which is rather counter-intuitive, to say the least.
I've since moved it outside that block.

UPDATE 2: The output of my modified script:
    SV = PV(0x84c2a8) at 0x373100
      REFCNT = 1
      FLAGS = (POK,pPOK,UTF8)
      PV = 0x247ede8 "cl\303\251"\0 [UTF8 "cl\x{e9}"]
      CUR = 4
      LEN = 5
    $VAR1 = "cl\x{e9}";
    SV = PV(0x84c2a8) at 0x373100
      REFCNT = 1
      FLAGS = (POK,pPOK,UTF8)
      PV = 0x247f4d8 "cl\303\251"\0 [UTF8 "cl\x{e9}"]
      CUR = 4
      LEN = 5
    $VAR1 = "cl\x{e9}";
    SV = PV(0x84c2a8) at 0x373100
      REFCNT = 1
      FLAGS = (POK,pPOK,UTF8)
      PV = 0x247f4d8 "cl\303\251"\0 [UTF8 "cl\x{e9}"]
      CUR = 4
      LEN = 5
    $VAR1 = "cl\x{e9}";
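
For anyone who wants to look at the UTF8 flag without reading Devel::Peek dumps, here is a minimal sketch using the core utf8::is_utf8 and utf8::upgrade functions. The helper name utf8_flag is made up for this illustration, and the string is built with chr(0xE9) so that the result does not depend on how the file is saved. It only shows that upgrading changes the internal storage, not the string's value; it does not reproduce the hash-key behaviour from the original question.

```perl
use strict;
use warnings;

# Hypothetical helper: report the internal UTF8 flag as 0 or 1.
sub utf8_flag {
    my ($str) = @_;
    return utf8::is_utf8($str) ? 1 : 0;
}

# Build "clé" from a code point, so the file encoding is irrelevant.
my $s = "cl" . chr(0xE9);     # 3 characters, stored as single bytes

my $upgraded = $s;
utf8::upgrade($upgraded);     # same characters, now stored as UTF-8

printf "flag before upgrade: %d\n", utf8_flag($s);
printf "flag after upgrade:  %d\n", utf8_flag($upgraded);
printf "still equal:         %s\n", $upgraded eq $s ? 'yes' : 'no';
printf "length in chars:     %d\n", length $upgraded;
```

The character length stays at 3 either way; the CUR = 4 in the dumps above is the byte length of the upgraded storage.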

HTH.

Cheers,
Rob

Re^2: UTF8 hash key downgraded when assigned
by choroba (Cardinal) on Dec 01, 2018 at 01:07 UTC
    Have you saved it as UTF-8? \xe9\x27\x20 seems to be cp1252 for é'.

      Have you saved it as UTF-8?

      No, and I can't immediately find a way of doing so on this Windows machine.
      Is that the reason the script, as posted by the OP, failed to compile for me?

      I thought that my script might have been relevant, since its output matched the output the OP expected.
      But if it's not relevant then please let me know (and I'll mark it so).

      Cheers,
      Rob

        Yes. You told Perl to expect UTF-8, but didn't provide it. Notepad can "Save As" UTF-8.
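
The diagnosis above can be checked with a short sketch using the core Encode module. The helper names is_valid_utf8 and as_cp1252 are invented for this example, and cp1252 is only the encoding guessed earlier in the thread, not something the error message itself confirms. The idea is to feed the three bytes from the error message to a strict UTF-8 decoder and then to a cp1252 decoder.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK LEAVE_SRC);

# The three bytes quoted in the "Malformed UTF-8 character" message.
my $bytes = "\xe9\x27\x20";

# Hypothetical helper: true if the octets are well-formed UTF-8.
sub is_valid_utf8 {
    my ($octets) = @_;
    my $ok = eval { decode('UTF-8', $octets, FB_CROAK | LEAVE_SRC); 1 };
    return $ok ? 1 : 0;
}

# Hypothetical helper: decode the octets as cp1252 (an assumption
# about what the editor used when saving the script).
sub as_cp1252 {
    my ($octets) = @_;
    return decode('cp1252', $octets, FB_CROAK | LEAVE_SRC);
}

printf "valid UTF-8: %s\n", is_valid_utf8($bytes) ? 'yes' : 'no';

my $text = as_cp1252($bytes);
printf "as cp1252:   %s\n",
    join ' ', map { sprintf 'U+%04X', ord($_) } split //, $text;
```

Read as cp1252, the bytes are é (U+00E9), an apostrophe and a space, which fits the tail of a quoted 'clé' in a file that was not saved as UTF-8.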