saintmike has asked for the wisdom of the Perl Monks concerning the following question:

Esteemed monks,

I'm trying to find out what happens if a user with an international keyboard types something like

$ myprogram Belvédère

In which encoding does the application myprogram receive the accented characters, ISO-8859-1 or UTF-8? I guess this depends on the type of terminal used, so let's say Linux/X11.

Also, is there a general rule or a way to find out, like, by looking at the locale?

I can't really reproduce it here without an intl keyboard, any help is greatly appreciated ...

Replies are listed 'Best First'.
Re: Character Encoding of Keyboard Input
by gaal (Parson) on Mar 02, 2005 at 18:31 UTC
    This depends on the locale. On modern Linux systems it is often something like en_US.UTF-8, with a terminal emulator such as xterm -u8 or one of the fancy GNOME / KDE terminals, which also support Unicode. What does

    env | grep LANG

    say?
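
    LANG alone may not tell the whole story: under POSIX rules a non-empty LC_ALL overrides LC_CTYPE, which in turn overrides LANG. Here is a minimal shell sketch of that lookup. The function name effective_codeset is made up for illustration, and it only parses the locale name, so where it is available `locale charmap` is the more authoritative answer.

```shell
# Hypothetical helper: print the codeset part of the locale that governs
# character handling. Arguments are the values of LC_ALL, LC_CTYPE and LANG
# (any of them may be empty); the first non-empty one wins.
effective_codeset() {
    loc=${1:-${2:-${3:-C}}}
    case $loc in
        *.*)
            cs=${loc#*.}                  # drop "language_TERRITORY."
            printf '%s\n' "${cs%%@*}"     # drop an optional "@modifier"
            ;;
        *)
            # Names such as "C" or "POSIX" carry no codeset of their own.
            printf '%s\n' "unspecified"
            ;;
    esac
}

# Report the codeset for the current environment.
effective_codeset "${LC_ALL-}" "${LC_CTYPE-}" "${LANG-}"
```

    If this prints UTF-8, the terminal is most likely sending UTF-8 as well, though a misconfigured terminal emulator can still disagree with the locale.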

    Note, BTW, that as far as Perl is concerned, this has only incidentally to do with *keyboard* input: myprogram gets its data in @ARGV, as whatever bytes the shell passed along.
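
    To make the difference concrete, here is a small shell sketch of the two byte sequences that could arrive as the argument for the word in the question. The helper names utf8_word and latin1_word are invented for this example; the octal escapes spell out é and è in each encoding.

```shell
# "Belvédère" as raw bytes. In UTF-8, é is C3 A9 and è is C3 A8 (two bytes
# each); in ISO-8859-1, é is E9 and è is E8 (one byte each).
utf8_word()   { printf 'Belv\303\251d\303\250re'; }
latin1_word() { printf 'Belv\351d\350re'; }

# Count the bytes the program would see in its argument.
utf8_word   | wc -c    # 11 bytes: 7 ASCII letters + 2 * 2
latin1_word | wc -c    # 9 bytes: one per character
```

    A program that simply counts the bytes in its first argument can therefore make a rough guess about which encoding it was handed, at least for a known test word like this one.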

Re: Character Encoding of Keyboard Input
by fizbin (Chaplain) on Mar 02, 2005 at 19:03 UTC
    Yes, the answer is "it depends" - this issue is perhaps best dealt with by looking at this recent post, and its replies: 433998.
    -- @/=map{[/./g]}qw/.h_nJ Xapou cets krht ele_ r_ra/; map{y/X_/\n /;print}map{pop@$_}@/for@/