Re: Is it perl or MySQL? ü ends up in CGI script
by moritz (Cardinal) on Feb 08, 2008 at 14:28 UTC
|
Is is possible to have utf8 data in a utf8 table, but the data is outputtet by mysql as latin1?
It should be possible, but IMHO that's not worth the trouble.
Really, charset issues are ugly to work with, and increasing the ugliness by using a terminal with a different encoding than you work with is just plain insane.
I tried that for a while, end then I switched everything to UTF-8 - my files, terminal, locales, cgi output, screen, irssi - you name it.
So my advice is:
- Get a clean environment with utf-8 locales
- dump your databases, and delete them
- Set any charset related configuration variable in mysql to utf-8
- Import the data, if possible use Encode::Guess to check if the data is utf-8 or latin1. If it is latin1, recode.
- Have fun!
mysql's charset handling is rather mysterious to me, and seemingly fragile, so I advise not to fiddle with it in complicated ways. (But if you find somebody who can tell you how to get it to work, you can use that route, of course). | [reply] |
|
|
Now, how do I alter the mysqld environment: All of the following in the my.cnf fail:
# character-set-client =utf8
# character-set-connection =utf8
# character-set-database =utf8
# set-variable = character_set_filesystem =binary
# set-variable = character_set_results =utf8
# set-variable = character_set_server =utf8
# set-variable = character_set_system =utf8
# set-variable = character_sets_dir =/usr/share/mysql/charsets/
# collation-connection =utf8_general_ci
# collation-database = utf8_general_ci
# collation-server = utf8_general_ci
Update: (THREAD SOLVED)
Ok, let's check your current configuration:
$ mysql
> SHOW VARIABLES;
You probably see lots of "latin1" for variables with a charset_* prefix. These are common defaults.
You might want to change this. Compare the manual to understand what we do now.
You can set the MySQL config variables globally in the my.cnf config file, with:
[mysqld]
...
init_connect="SET collation_connection = utf8_general_ci"
init_connect="SET NAMES utf8"
default-character-set = utf8
character-set-server = utf8
collation-server = utf8_general_ci
[client]
default-character-set = utf8
[mysqldump]
...
default-character-set = utf8
Or you can do this on a "per connection" basis. For example, when you use DBI and MySQL, set mysql_enable_utf8 to a true value via the extra params passed to the driver upon connection.
(This thread is marked as being solved, as I found out, most problems similar to the described one here are coming from the MySQL server being in default/latin1 mode while you try to have everything else in your pipeline in utf8. Changing this last element to utf8 solves it.) | [reply] [d/l] [select] |
|
|
Is is possible to have utf8 data in a utf8 table, but the data is outputtet by mysql as latin1?
actually, that wasn't my intention. But it seems to be the only explanation for the error - so I wanted to know if this is generally possible...
btw: I will now do as advised and set all env variables to utf8.
| [reply] |
Re: Is it perl or MySQL? ü ends up in CGI script
by TedYoung (Deacon) on Feb 08, 2008 at 15:07 UTC
|
For what it is worth, we call this problem "Irritable Vowel Syndrome"
| [reply] |
Re: Is it perl or MySQL? ü ends up in CGI script
by olus (Curate) on Feb 08, 2008 at 14:52 UTC
|
...
use Encode qw ( from_to );
...
my @row = $stmt->fetchrow_array();
map{ from_to($_, "latin1","utf8"); } @row;
...
| [reply] [d/l] |
Re: Is it perl or MySQL? ü ends up in CGI script
by oko1 (Deacon) on Feb 08, 2008 at 14:21 UTC
|
What happens when you do a simple text SELECT query? That'll separate the MySQL::Men from the HTML::Sheep.
| [reply] |
Re: Is it perl or MySQL? ü ends up in CGI script
by isync (Hermit) on Feb 08, 2008 at 16:10 UTC
|
Now I configured my environment right:
| character_set_client | utf8
| character_set_connection | utf8
| character_set_database | utf8
| character_set_filesystem | binary
| character_set_results | utf8
| character_set_server | utf8
| character_set_system | utf8
| character_sets_dir | /usr/share/mysql/charsets/ |
| collation_connection | utf8_general_ci
| collation_database | utf8_general_ci
| collation_server | utf8_general_ci
But the problem persists in a new iteration:
in phpmyadmin: the same ü for ü
but now, also via console # mysql > SELECT ...
I get garbled data. So far it seemed a problem of the output only....
Any ideas?
| [reply] |
|
|
Does that happen for clean UTF-8 data that you inserted after changing the configuration?
| [reply] |
|
|
Yes, but now it works, as can be seen in my update, two more line of configuration make a difference (altough I don't fully understand, what I did...)
| [reply] |