in reply to Using setlocale() on Windows with utf-8 support

Rob, locale identifiers are platform-dependent. The id "de" probably works nowhere; "de_DE" works on most Unix systems, provided the locale "de_DE" is installed. On Windows you have to use "German" or "German_Germany", which are equivalent. But all of that is clear and is not the question.

The question was: Is it possible to activate a UTF-8 locale on Windows? The Microsoft documentation claims that it is possible but how can that feature be used from Perl?

Re^2: Using setlocale() on Windows with utf-8 support
by syphilis (Archbishop) on Jul 18, 2023 at 12:49 UTC
    The question was: Is it possible to activate a UTF-8 locale on Windows?

    It looks like it's probably working for me on Windows 11, but only if the C toolchain that built perl (or that builds your executable) is a Microsoft one.
    Here's a copy'n'paste (that doesn't render exactly as it appears) of what I get, having built your demo C program (into try.exe) using Visual Studio 2022:
    D:\C>try.exe German.utf8
    German_Germany.utf8: März (5 bytes)
    And here's what I get using perl-5.38.0 that was built with the same Visual Studio 2022 compiler:
    D:\>perl -MPOSIX -wle "$loc = POSIX::setlocale( LC_ALL, 'German.utf8' ); print $loc;"
    German_Germany.utf8
    But if I use my perl-5.38.0 that was built with a mingw-w64 port of gcc-13.1.0, then I get:
    D:\>perl -MPOSIX -wle "$loc = POSIX::setlocale( LC_ALL, 'German.utf8' ); print $loc;"
    Use of uninitialized value $loc in print at -e line 1.
    And if I use that gcc-13.1.0 to build your C program into try_gcc.exe, then I get:
    D:\C>try_gcc.exe German.utf8
    (null): March (5 bytes)
    From which I deduce that the behavior you need has not yet been ported to the mingw-w64 toolchain.
    If you need it to work with the mingw-w64 compilers then you could make enquiries about that by (eg) posting to mingw-w64-public@lists.sourceforge.net .

    Cheers,
    Rob

      That was very helpful! Thanks!

      I will have a look at the mingw-64 sources and maybe file an issue.

      Cheers,
      Guido

        I will have a look at the mingw-64 sources and maybe file an issue

        Seems to be a runtime issue.
        https://winlibs.com provides a gcc-13.2.0 build with the MSVCRT runtime, and a separate gcc-13.2.0 build with the universal C runtime (UCRT).
        The one with UCRT provides the utf-8 support, the one with MSVCRT does not.

        Strawberry Perl uses the MSVCRT one, and if you want to build perl-5.38.0 (or earlier) with the UCRT one, you'll need to patch the perl source.
        But perl-5.39.2 builds straight out of the box with the UCRT one.
        Hence my own personal build of perl-5.39.2 (built by gcc-13.2.0 UCRT) provides the desired behaviour.

        It may be that the UCRT version of gcc-13.1.0 (and perhaps earlier) also provides that utf-8 support. (I haven't tested.)
        In any case, one can grab the UCRT gcc-13.2.0 and check that C programs are receiving that utf-8 support.

        Cheers,
        Rob