Cannot preserve Latin 2 character sets in Perl

PerlPksky has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Cannot preserve Latin 2 character sets in Perl by kennethk (Abbot) on Sep 29, 2011 at 17:11 UTC
The standard reference documents on Unicode in Perl are:
perlunitut - Perl Unicode Tutorial
perlunifaq - Perl Unicode FAQ
perluniintro - Perl Unicode introduction
perlunicode - Unicode support in Perl
perluniprops - Index of Unicode Version 6.0.0 properties in Perl
Re: Cannot preserve Latin 2 character sets in Perl by Anonymous Monk on Sep 29, 2011 at 16:52 UTC
PerlMonks is buggy, and the local gods refuse to fix it; I can't help with that. As for your actual question, you need to learn about character encodings. Start by reading http://p3rl.org/UNI. The following program does what you expect: `use utf8; use Encode qw(encode); print encode 'UTF-8', 'Župljanin and Stanišić';`
Re^2: Cannot preserve Latin 2 character sets in Perl by PerlPksky (Initiate) on Sep 29, 2011 at 17:11 UTC
That didn't work exactly, but I poked around a little and tried `use utf8; use Encode qw(encode); print encode 'iso-8859-2', 'Župljanin and Stanišić'."\n";` and that worked. I should add that `use utf8;` has to be there; it doesn't work without it. Thanks for the tip.
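A minimal sketch of what the two snippets above do differently, using only the core Encode module. The variable names are illustrative and not from the thread, and the file is assumed to be saved as UTF-8. With `use utf8;` the literal is a string of characters, and `encode` decides which bytes represent them; without it, Perl sees the UTF-8 bytes of the source as individual characters, so encoding them again gives the wrong result.

```perl
use strict;
use warnings;
use utf8;                      # assumption: this source file is saved as UTF-8
use Encode qw(encode);

# With "use utf8", this literal is a string of 22 characters, not raw bytes.
my $name = 'Župljanin and Stanišić';

# encode() turns characters into the bytes of one specific encoding.
my $latin2 = encode('iso-8859-2', $name);   # one byte per character
my $utf8   = encode('UTF-8',      $name);   # Ž, š and ć take two bytes each

printf "characters: %d, Latin-2 bytes: %d, UTF-8 bytes: %d\n",
    length($name), length($latin2), length($utf8);

# Hex dump of the first character: Ž is AE in Latin-2 but C5 BD in UTF-8.
printf "Latin-2: %s\n", uc unpack('H2', $latin2);
printf "UTF-8:   %s\n", uc unpack('H4', $utf8);
```

Which of the two byte strings displays correctly depends on what the terminal (here, PuTTY set to Latin 2) expects to receive.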
Re: Cannot preserve Latin 2 character sets in Perl by moritz (Cardinal) on Sep 29, 2011 at 18:26 UTC
You need to supply more context. By default, Perl treats literal strings as bytes, so if your script is stored in the encoding that your console accepts, it should work. If it doesn't work, there is probably a mismatch between these two encodings, but that's hard to diagnose without even knowing which operating system you use. If this is some Unix dialect, what's your locale? What terminal or terminal emulator are you using? Which editor do you use, and what character encoding does it store files in?
Perl 6 - second systems done right
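One way to remove the dependence on the script's own encoding is to work with character strings and let an I/O layer produce the bytes the console expects. The sketch below is only an illustration: `bytes_written` is a hypothetical helper, and it writes to an in-memory handle so the resulting bytes can be inspected.

```perl
use strict;
use warnings;
use utf8;    # assumption: this source file is saved as UTF-8

# Hypothetical helper: print $text through an encoding layer and
# return the raw bytes that would reach the console or file.
sub bytes_written {
    my ($encoding, $text) = @_;
    my $buffer = '';
    open(my $fh, ">:encoding($encoding)", \$buffer)
        or die "cannot open in-memory handle: $!";
    print {$fh} $text;
    close($fh) or die "cannot close in-memory handle: $!";
    return $buffer;
}

my $line      = "Stanišić\n";
my $as_latin2 = bytes_written('iso-8859-2', $line);
my $as_utf8   = bytes_written('UTF-8',      $line);

# Same characters, different bytes: a terminal shows the right glyphs
# only when it decodes with the encoding that was used to write them.
print "Latin-2: ", unpack('H*', $as_latin2), "\n";
print "UTF-8:   ", unpack('H*', $as_utf8),   "\n";
```

For real console output the same layer can be applied with `binmode(STDOUT, ':encoding(iso-8859-2)');`, assuming the terminal is set to Latin 2.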
Re^2: Cannot preserve Latin 2 character sets in Perl by PerlPksky (Initiate) on Sep 29, 2011 at 22:19 UTC
OK. I am editing on Windows XP Pro using Notepad++ and Dreamweaver. The file lives on a remote machine on the LAN running SME Server, a Linux distribution, and I run the script there using PuTTY. I usually have to change the translation setting in PuTTY to Latin 2 to get uncorrupted characters on output. What I am trying to do is take output from MySQL and print an HTML page that displays these Croatian characters. MySQL requires some tinkering to handle these characters correctly, and only a recent version past 5.0 will do it. But at this point the difficulty is taking the characters that have survived output from MySQL and printing them to a page generated by Perl. So the code is something like: `$HTML = encode( "iso-8859-2", $HTML); while (<OLD>) { s!<body>.*</body>!<body>$HTML</body>!gs; print NEW $_ or die "can't write $new: $!"; }` But this isn't working. The output has changed, but the characters are not being preserved. I now get wrong characters in the browser instead of little squares. What am I missing?
Re^3: Cannot preserve Latin 2 character sets in Perl by moritz (Cardinal) on Sep 30, 2011 at 06:48 UTC
Is the problem you describe related to your original problem at all? I don't remember reading anything about MySQL, CGI, browsers etc. Anyway, what charset do you specify in the Content-Type header? See also: Character encodings and Unicode in Perl.
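The point of the Content-Type question is that the browser decodes the page with the charset named in the header, so the header and the encoded body have to agree. A minimal sketch, assuming a CGI-style response; `html_response` is a hypothetical helper, not code from the thread:

```perl
use strict;
use warnings;
use utf8;                      # assumption: this source file is saved as UTF-8
use Encode qw(encode);

# Hypothetical helper: build a response whose declared charset is the
# same encoding that is used to turn the HTML characters into bytes.
sub html_response {
    my ($charset, $html) = @_;
    my $header = "Content-Type: text/html; charset=$charset\r\n\r\n";
    return $header . encode($charset, $html);
}

my $page = html_response('iso-8859-2',
    "<body>Župljanin and Stanišić</body>\n");

binmode(STDOUT);               # the string is already bytes; add no layer
print $page;
```

If the page is written to a static file instead, the charset would have to be declared by the web server or in the HTML itself, which this sketch does not cover.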
Problem solved by PerlPksky (Initiate) on Sep 30, 2011 at 16:14 UTC