Re: Perl, Gtk2 and locale — a bit of a mess
by choroba (Cardinal) on Jul 11, 2013 at 13:26 UTC
|
The documentation of POSIX uses a different way to refer to locale subtypes: constants.
perl -we 'use locale; use POSIX; POSIX::setlocale(POSIX::LC_NUMERIC, "hu_HU.UTF-8"); printf qq{%f\n}, 123.456;'
Note: Tested with cs_CZ locale, as hu_HU is not installed here.
|
|
Yeah, sorry, just noticed that; if anything it doesn't change a thing.
More precisely, this is what happens in the code now:
use POSIX qw/strftime setlocale/;
BEGIN { ## To hopefully make it happen way before Gtk2 init
setlocale(POSIX::LC_ALL(), "en_US.UTF-8");
}
use Glib qw/FALSE TRUE/;
use Gtk2 -init;
On my coworker’s computer (Hungarian locale to begin with, Fedora, GTK 3 desktop) we get commas; on my computer (English locale, KDE desktop, but with the shell locale set to Hungarian) we get periods.
Except this only happens with the software we’re making. The example one-liners above behave identically on both machines.
Re: Perl, Gtk2 and locale — a bit of a mess
by Ralesk (Pilgrim) on Jul 11, 2013 at 15:55 UTC
|
After a talk with Nei on #gtk-perl I’ve come to a few conclusions. For one, I’m really bad at this whole locale thing. For two, this whole locale thing is pretty much broken by design.
There are a few things working together that make this so bad:
- GTK will call setlocale(LC_ALL, "") when it starts, so we were mistaken about the use of the Perl instruction POSIX::setlocale — it should, by all means, go after the Gtk init, so as to actually override whatever Gtk loaded from the environment
- C’s locale support is pretty much broken: there’s apparently no way to say “this is something user-facing, please present it as appropriate” and “this is something that must remain exactly the way I’m saying it”.
- Perl inherits this behaviour: unless the libraries dealing with numbers call setlocale(LC_NUMERIC, "C"), they end up producing localised numbers the instant they turn a number into a string. Many of them do just that.
- The JSON encoder, it appears, gets slightly confused by this and produces a JSON string with a comma as the decimal separator, like { cmd: "something", ts: 1373556417,044533, data: { ... } }
So, for me, the solution is turning locales off on numerals. For others, it would require calling setlocale back and forth. Here is another example of this issue cropping up all of a sudden.
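For the "back and forth" variant, here is a minimal sketch of a scoped switch, assuming only the core POSIX module. The helper name with_c_numeric is made up for illustration, and the JSON line is built by hand to show the idea; it is not what our software does. Newer perls may already keep LC_NUMERIC at C outside the scope of use locale, in which case the helper is harmless but redundant.

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_NUMERIC);

# Hypothetical helper: run a code block with LC_NUMERIC forced to "C",
# then restore whatever numeric locale was active before.
sub with_c_numeric {
    my ($code) = @_;
    my $saved = setlocale(LC_NUMERIC);   # one-argument form only queries
    setlocale(LC_NUMERIC, "C");
    my @result = eval { $code->() };     # do not skip the restore on die
    my $err = $@;
    setlocale(LC_NUMERIC, $saved) if defined $saved;
    die $err if $err;
    return wantarray ? @result : $result[0];
}

setlocale(LC_NUMERIC, "");               # adopt the environment, as Gtk would

# Machine-facing number: stringify it under the C locale.
my $ts = with_c_numeric(sub { sprintf "%.6f", 1373556417.044533 });
print qq({ "cmd": "something", "ts": $ts }\n);
```

Anything user-facing can still be formatted outside the helper, under the environment locale.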
Re: Perl, Gtk2 and locale — a bit of a mess
by Ralesk (Pilgrim) on Jul 12, 2013 at 11:41 UTC
|
Okay, a little update. Just what is going on in Perl’s mind here?
$ perl -E 'use POSIX qw(setlocale LC_NUMERIC); say 3.14; setlocale(LC_NUMERIC, ""); say 3.14; setlocale(LC_NUMERIC, "C"); say 3.14; setlocale(LC_NUMERIC, ""); say 3.14; say 3.14 . "";'
3.14
3,14
3.14
3,14
3.14
Switching back and forth seems to work. When the environment locale is loaded, we get a comma; when the C locale is loaded, we get a period. Excellent.
Except that the last two disagree. If you concatenate anything to the float (I could have written print 3.14; print "\n"; print 3.14 . "\n"; instead, for the same effect), turning it into a string much as print or say does, it won’t follow the locale rules. I’d been battling the exact opposite until now!
And when we also use Gtk which will call setlocale on the C level:
~$ perl -E 'use Gtk2 -init; use POSIX qw(setlocale LC_NUMERIC); say 3.14; setlocale(LC_NUMERIC, ""); say 3.14; setlocale(LC_NUMERIC, "C"); say 3.14; setlocale(LC_NUMERIC, ""); say 3.14; say 3.14 . ""; '
3,14
3,14
3.14
3,14
3,14
The first is a comma, which is right: Gtk2 set the locale to Hungarian. The second remains a comma, as expected. The third becomes a period because of the C locale. The fourth is a comma, rightfully, because of the environment locale. And the last one now behaves as it should, remaining a comma, as appropriate for the environment locale in use.
|
|
Once again IRC helped, so for the sake of completeness I’m answering myself: the reason for the awkward difference is compile-time optimisation (constant folding). In the first case, that last concatenation and stringification had already happened at compile time, while the locale was still the initial, unset one.
Note how the last two prints work as expected if one replaces the constant with a subroutine call.
~$ perl -E 'sub x { 3.14 }; use POSIX qw(setlocale LC_NUMERIC); say x; setlocale(LC_NUMERIC, ""); say x; setlocale(LC_NUMERIC, "C"); say x; setlocale(LC_NUMERIC, ""); say x; say x . ""; '
3.14
3,14
3.14
3,14
3,14
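One way to look at the folding itself, rather than its effect on the output, is to ask B::Deparse what the compiler kept. This is only a sketch, assuming the core B::Deparse module; the exact deparsed text can differ between Perl versions, so treat the printed form as indicative.

```perl
use strict;
use warnings;
use B::Deparse;

my $deparse = B::Deparse->new;

# A literal concatenated with "" can be folded into a single string
# constant at compile time, before any later setlocale() call runs.
my $folded = $deparse->coderef2text(sub { return 3.14 . "" });

# A value held in a variable is only stringified when the sub runs,
# so it sees whatever locale is active at that moment.
my $runtime = $deparse->coderef2text(sub { my $n = shift; return $n . "" });

print "literal:  $folded\n";
print "variable: $runtime\n";
```

In the first sub the concatenation should be gone, with only a string constant left. In the second it is still there, which matches the behaviour of the sub x variant above.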
Re: Perl, Gtk2 and locale — a bit of a mess
by Ralesk (Pilgrim) on Sep 23, 2013 at 06:16 UTC
|
Re: Perl, Gtk2 and locale — a bit of a mess
by Khen1950fx (Canon) on Jul 11, 2013 at 23:20 UTC
|
I don't see it as a problem with locales. For example, try this:
#!/usr/bin/perl -l
use strict;
use warnings;

my $num = 123.456;
str($num);

# Swap the decimal point for a comma by hand and append a fixed suffix.
sub str {
    my ($want_num) = @_;
    my $suffix = '000';
    $want_num =~ tr/./,/;
    print "$want_num$suffix";
}
Does that work for you?
Re: Perl, Gtk2 and locale — a bit of a mess
by Ralesk (Pilgrim) on Jul 15, 2013 at 09:45 UTC
|
~$ locale
LANG=hu_HU.UTF-8
LC_CTYPE="hu_HU.UTF-8"
LC_NUMERIC="hu_HU.UTF-8"
LC_TIME="hu_HU.UTF-8"
LC_COLLATE="hu_HU.UTF-8"
LC_MONETARY="hu_HU.UTF-8"
LC_MESSAGES="hu_HU.UTF-8"
LC_PAPER="hu_HU.UTF-8"
LC_NAME="hu_HU.UTF-8"
LC_ADDRESS="hu_HU.UTF-8"
LC_TELEPHONE="hu_HU.UTF-8"
LC_MEASUREMENT="hu_HU.UTF-8"
LC_IDENTIFICATION="hu_HU.UTF-8"
LC_ALL=
~$ perl -e 'use POSIX qw(setlocale LC_ALL); print setlocale(LC_ALL);' ## Print the current locale
LC_CTYPE=hu_HU.UTF-8;LC_NUMERIC=C;LC_TIME=hu_HU.UTF-8;LC_COLLATE=hu_HU.UTF-8;LC_MONETARY=hu_HU.UTF-8;LC_MESSAGES=hu_HU.UTF-8;LC_PAPER=hu_HU.UTF-8;LC_NAME=hu_HU.UTF-8;LC_ADDRESS=hu_HU.UTF-8;LC_TELEPHONE=hu_HU.UTF-8;LC_MEASUREMENT=hu_HU.UTF-8;LC_IDENTIFICATION=hu_HU.UTF-8
Why is everything set except for LC_NUMERIC? Why is anything set? I never asked for this...
Update: C does the following:
#include <stdio.h>
#include <locale.h>

int main(void) {
    printf("%s\n", setlocale(LC_ALL, NULL));  /* query only */
    setlocale(LC_ALL, "");                    /* adopt the environment's locale */
    printf("%s\n", setlocale(LC_ALL, NULL));
    return 0;
}
~$ ./localetest
C
hu_HU.UTF-8
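To see which categories differ, it may help to query them one at a time instead of reading the combined LC_ALL string. A small sketch, assuming only the core POSIX module; the helper name current_locales is made up for illustration, and only five of the standard categories are covered.

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_CTYPE LC_NUMERIC LC_TIME LC_COLLATE LC_MONETARY);

# Hypothetical helper: query (not modify) a few standard categories.
sub current_locales {
    my %category = (
        LC_CTYPE    => LC_CTYPE,
        LC_NUMERIC  => LC_NUMERIC,
        LC_TIME     => LC_TIME,
        LC_COLLATE  => LC_COLLATE,
        LC_MONETARY => LC_MONETARY,
    );
    my %current;
    for my $name (keys %category) {
        # The one-argument form only asks; it changes nothing.
        $current{$name} = setlocale($category{$name}) // '(unknown)';
    }
    return %current;
}

my %now = current_locales();
printf "%-12s %s\n", $_, $now{$_} for sort keys %now;
```

Run at the very top of a script, before anything has called setlocale explicitly, this shows what perl itself set up at startup.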
|
|
What does the POSIX standard say should happen? What does your system say setlocale should do?
The setlocale link talks about POSIX::setlocale( LC_ALL , "" ) setting things the way you're seeing, but in Perl it's the default. Usage: POSIX::setlocale(category, locale = 0)
To query, C uses NULL; Perl's equivalent is
$ perl -e " use POSIX qw/ setlocale LC_ALL /; print setlocale( LC_ALL , undef ); "
English_United States.1252
So I don't think I'm seeing a bug here; it looks like it's working as designed.
|
|
POSIX(3pm) says that the Perl equivalent of C’s setlocale(cat, NULL) is setlocale($cat) (i.e. one argument). I’m not explicitly setting the locale to the environment-given locale (setlocale($cat, "") or in C setlocale(cat, "")) until later in the code, so it should default to C locale.
|
|
Also, setlocale(3) says the following: “If locale {the second param} is NULL, the current locale is only queried, not modified. On startup of the main program, the portable "C" locale is selected as default. A program may be made portable to all locales by calling setlocale(LC_ALL, "" ) after program initialization”