duc has asked for the wisdom of the Perl Monks concerning the following question:

Hi!

(All right, I know: me again ;)

So here is what I need to do: I have to modify some registry keys. The name of the key to modify is entered by the user (it really needs to be). Unfortunately, in some cases the name of the key contains French accents. I have been running quick tests using Win32::TieRegistry and Encode. Here is one of them:

    use Encode;
    use Win32::TieRegistry;

    #
    #my @allencodings = Encode->encodings(":all");
    #print join("\n", @allencodings);

    my $val;
    my $val_en;

    print "Enter a name with French accent:\n";
    $val_en = encode("utf8", <STDIN>);
    chomp $val_en;

    #registry to modify (depends on program version)
    my $reg = "HKEY_LOCAL_MACHINE/SOFTWARE/";
    $val = decode("utf8",$val_en);

    #sets the delimiter to / for the registry reading and writing
    $Registry->Delimiter("/");
    $testKey = $Registry->{$reg};

    #structure of the registry
    %hashReg = ("Applications" => {"$val" => {"1.8" => {"Directories" => {"Config" => "test"}}}});
    $testKey->StoreKey("Compagnie/", \%hashReg);
    print "\nDone!";


If I print the decoded value $val instead of using it in the registry, it works fine. For example, if I enter "tiré", that is exactly what is printed on the screen. In the registry, however, I get "tirÂ,", which is quite different.

I have tried different decodings, but so far none has worked. So I was wondering whether anybody knows how to get French accents into the registry.

Re: French Accent in Windows Registry (A/W)
by tye (Sage) on Jul 27, 2006 at 14:05 UTC

    Win32::TieRegistry uses the *A APIs ("ANSI") for the Registry and so needs 8-bit characters encoded to match your locale. Win32API::Registry exposes the *W APIs ("wide") for the Registry which would allow (and require) you to use Windows "UNICODE" characters (UCS-2LE last I checked but if it became UTF-16, it likely doesn't matter for French characters).

    So, likely the easiest solution is to convert your strings to the current locale's 8-bit encoding and continue to use Win32::TieRegistry.

    - tye        
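
    A minimal sketch of the conversion suggested above, for illustration only. It assumes the console delivers cp850 bytes and that the locale's 8-bit ("ANSI") code page is cp1252; both are guesses for a Western European system and should be checked on the actual machine. The helper console_to_ansi and the variables $console_enc and $ansi_enc are hypothetical names, not part of any module.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Assumed code pages (not confirmed by the thread): cp850 console, cp1252 ANSI.
my $console_enc = 'cp850';
my $ansi_enc    = 'cp1252';

# Hypothetical helper: turn raw console bytes into bytes for the locale's
# 8-bit encoding, going through Perl's internal character representation.
sub console_to_ansi {
    my ($bytes) = @_;
    my $chars = decode($console_enc, $bytes);   # bytes -> characters
    return encode($ansi_enc, $chars);           # characters -> ANSI bytes
}

# "tir\x82" is how a cp850 console would send "tiré" (0x82 is é in cp850).
my $ansi = console_to_ansi("tir\x82");

# Show the resulting bytes in hex; é should now be 0xE9, its cp1252 value.
print join(' ', map { sprintf '%02X', ord } split //, $ansi), "\n";
```

    The string in $ansi is what one would then try as the key name with Win32::TieRegistry, if the assumptions about the code pages hold.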

      After your answer I realized that the problem is more interesting than I thought :) I tried to add a Russian word to the registry, and it turned out to be far trickier than I expected :) I will correct my previous comment.
      *A APIs ("ANSI")
      Does this mean that Windows, not Win32::TieRegistry, decides which charset is used for the input?

           s;;Just-me-not-h-Ni-m-P-Ni-lm-I-ar-O-Ni;;tr?IerONim-?HAcker ?d;print
        Well, thanks for your kindness, and if you do find a way to do it, tell me, because I am still looking. I thought it would be easy :| I guess I was wrong :o BTW, I have tried from_to($val, "cp850","cp1252"), but still got "Pr,cis" instead of Précis...
Re: French Accent in Windows Registry
by Ieronim (Friar) on Jul 27, 2006 at 13:28 UTC
    Your code does something very silly. You confuse encode and decode. I myself confused them for a very long time, and learned the difference only on PerlMonks.

    Let's look at what you do: first you encode the string received from the terminal into utf8, and then you DECODE IT BACK! So it is again in the encoding of your terminal. It is no wonder that you can print it back to the terminal, but the Windows Registry of course misinterprets it.
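
    The round trip can be checked without touching the registry. The sketch below is only an illustration and assumes a cp850 console and a cp1252 Windows code page, neither of which is confirmed in this thread. Under those assumptions it would also account for the comma-like character reported in the question.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Assumption: the console is cp850, so "tiré" arrives as these four bytes.
my $input = "tir\x82";

# encode() treats each byte as a Latin-1 character and emits UTF-8 bytes;
# decode() then undoes exactly that, so nothing has been converted.
my $encoded = encode('UTF-8', $input);
my $decoded = decode('UTF-8', $encoded);
print $decoded eq $input ? "round trip: unchanged\n" : "round trip: changed\n";

# Read as cp1252 (assumed), the raw byte 0x82 is U+201A, a low quotation
# mark that looks like a comma.
my $seen_raw = decode('cp1252', $input);
printf "raw byte seen as U+%04X\n", ord substr($seen_raw, 3, 1);

# The UTF-8 bytes C2 82 read as cp1252 give U+00C2 followed by U+201A.
my $seen_utf8 = decode('cp1252', $encoded);
printf "UTF-8 bytes seen as U+%04X U+%04X\n",
    ord substr($seen_utf8, 3, 1), ord substr($seen_utf8, 4, 1);
```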

    You need

    $val = <STDIN>; from_to($val, $enc, "UTF-16");
    where $enc is the encoding of your input data, 'cp1250', I suppose.
    $val = <STDIN>; from_to($val, 'cp850', 'cp1252');
    UPDATE: Added the most likely variant according to tye's note, but it can still be wrong. You need to convert from the 'DOS' French codepage to the 'Windows' French codepage. I inserted the values for the Western European codepages.

         s;;Just-me-not-h-Ni-m-P-Ni-lm-I-ar-O-Ni;;tr?IerONim-?HAcker ?d;print

      I knew I was doing something wrong! I have a little something against unclear documentation... Anyway, I guess it is a start, but since the combination of cp1250 and UTF-16 is not working, I will have to go through the possibilities!

      Thanks a lot to you !

Re: French Accent in Windows Registry
by gellyfish (Monsignor) on Jul 27, 2006 at 12:53 UTC

    Are you sure that the registry keys are UTF-8? Microsoft "Unicode" is occasionally UCS-2 or UTF-16.

    /J\

      This was exactly my point: I have tried different decoding formats, but I did not find the one I needed. I will try UCS-2 because I don't have the option UTF-16. Thanks for the info :)

      It isn't UTF-16, UTF-16LE, UTF-16BE, UCS-2BE or UCS-2LE.. :( I will find it!!! No matter what it costs ;)
        Your answer makes no sense. Windows (starting from 2000) uses UTF-16, surely. And this encoding is of course supported by Encode.pm.

             s;;Just-me-not-h-Ni-m-P-Ni-lm-I-ar-O-Ni;;tr?IerONim-?HAcker ?d;print
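
        For reference, a small sketch of how the UTF-16 variants tried above differ at the byte level in Encode. It is an illustration only; hex_bytes is a hypothetical helper. Windows wide strings are little-endian without a byte order mark, so 'UTF-16LE' is the variant that would match a *W API. Every variant contains NUL bytes for a name like this one, which may be why none of them helps when the string is passed through an 8-bit (*A) interface.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: render a byte string as space-separated hex.
sub hex_bytes { join ' ', map { sprintf '%02X', ord } split //, $_[0] }

my $name = "tir\x{E9}";   # "tiré" as Perl characters

# Plain 'UTF-16' prepends a byte order mark and writes big-endian units.
my $with_bom = encode('UTF-16', $name);

# 'UTF-16LE' writes little-endian units with no byte order mark.
my $le = encode('UTF-16LE', $name);

print hex_bytes($with_bom), "\n";   # FE FF 00 74 00 69 00 72 00 E9
print hex_bytes($le),       "\n";   # 74 00 69 00 72 00 E9 00
```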