PerlMonks  

Need tipps for identifying utf-8 problems with Dancer and MySQL

by Skeeve (Parson)
on May 05, 2014 at 10:52 UTC ( [id://1085033]=perlquestion )

Skeeve has asked for the wisdom of the Perl Monks concerning the following question:

I have a MySQL table containing some strings with German umlauts. As far as I can tell from MySQL, these strings are stored in UTF-8, and in MySQL Workbench they appear correct.

Now I have a Dancer application which retrieves those strings. config.yml defines the charset to be UTF-8, and returning static strings containing umlauts, either from my templates or from a Perl string, does indeed work fine.

Only the strings I retrieve from the database get messed up. Instead of "ü" I get, for example, "Ã¼".

I hope some of you have some good tips for identifying the source of my problem.


s$$([},&%#}/&/]+}%&{})*;#$&&s&&$^X.($'^"%]=\&(|?*{%
+.+=%;.#_}\&"^"-+%*).}%:##%}={~=~:.")&e&&s""`$''`"e

Replies are listed 'Best First'.
Re: Need tipps for identifying utf-8 problems with Dancer and MySQL
by McA (Priest) on May 05, 2014 at 12:15 UTC

    Hi,

    just a hint for debugging:

    1.) Make sure your MySQL installation really stores the data in UTF-8. Take the table in question and run show create table blabla\G in the mysql client. If an alternative charset is declared, you can see it on the last line of the output. Also check via show global variables like '%char%';.

    2.) If you have UTF-8 enabled, check whether the MySQL connect option mysql_enable_utf8 is set to true.

    3.) Insert a debug statement right after fetching data from the database. Once you have a string in hand, do the following:

    my $utf8_flag = utf8::is_utf8($string) ? 1 : 0;
    print STDERR "For String '$string' UTF8-Flag is: $utf8_flag\n";

    When your whole code is running in a UTF-8 environment, then you should get a '1' there.
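The flag check from step 3 can be extended a little so that the terminal's own encoding cannot fool you. This is only a sketch: describe_string is a hypothetical helper name, and the sample data is made up. It reports the flag, the length, and the code points in plain ASCII.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: describe a string using ASCII only, so the output
# looks the same no matter how the terminal is configured.
sub describe_string {
    my ($string) = @_;
    my $flag  = utf8::is_utf8($string) ? 1 : 0;
    my $chars = join ' ', map { sprintf('U+%04X', ord($_)) } split //, $string;
    return sprintf('flag=%d length=%d chars=[%s]', $flag, length($string), $chars);
}

# Made-up sample data: the UTF-8 bytes for "über", as an undecoded fetch
# would deliver them, and the same text after decoding.
my $raw_bytes = "\x{C3}\x{BC}ber";
my $decoded   = decode('UTF-8', $raw_bytes);

print STDERR describe_string($raw_bytes), "\n";
# flag=0 length=5 chars=[U+00C3 U+00BC U+0062 U+0065 U+0072]
print STDERR describe_string($decoded), "\n";
# flag=1 length=4 chars=[U+00FC U+0062 U+0065 U+0072]
```

If the string straight from the database looks like the first line, the driver handed you bytes, not characters.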

    The output you get is a sign of double encoding. Look for places where this kind of double encoding could happen. Rule of thumb: don't encode until the data reaches the output boundary.

    Best regards
    McA
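To make the double encoding mentioned above concrete, here is a small sketch using only the core Encode module. The variable names and the hex_bytes helper are made up for illustration. It encodes "ü" once (correct) and twice (the mistake), then undoes one layer as a diagnostic.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Illustrative helper: show the bytes of a string as hex.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf('%02X', ord($_)) } split //, $bytes;
}

my $char_string = "\x{FC}";                     # one character, U+00FC (ü)
my $once  = encode('UTF-8', $char_string);      # correct output bytes
# The mistake: already encoded bytes are treated as characters and encoded again.
my $twice = encode('UTF-8', $once);

print hex_bytes($once),  "\n";   # C3 BC
print hex_bytes($twice), "\n";   # C3 83 C2 BC, which a browser shows as "Ã¼"

# Diagnostic: peel off one layer. If this yields the expected text,
# the data was encoded twice somewhere along the way.
my $one_layer = decode('UTF-8', $twice);            # characters U+00C3 U+00BC
my $as_bytes  = encode('ISO-8859-1', $one_layer);   # back to bytes C3 BC
my $repaired  = decode('UTF-8', $as_bytes);         # U+00FC again
printf "U+%04X\n", ord($repaired);                  # U+00FC
```

The repair step is meant for diagnosis only; the real fix is to find the place where the second encoding happens.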

      Thanks for the advice on utf8::is_utf8! I stumbled on the same problem, but I am using sqlite3. When reading the string from the database, I can see that it contains the right bytes(*), but the utf8_flag is 0.

      How do I convince Perl that the string from the database is really a utf8 string? I think that I need to open the sqlite database with some option so that all strings read from the database will receive the utf8-flag.

      I tried utf8::upgrade, but it does not work: on the web page the single accented character shows up as 2 accented characters.

      (*) Printing to STDERR, which is connected to a utf8 terminal, shows the correct accented character.
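A short sketch of why utf8::upgrade does not help here: it only changes how Perl stores the string internally and keeps the same characters, whereas decoding interprets the bytes as UTF-8. The sample bytes are made up; only core functions are used.

```perl
use strict;
use warnings;
use Encode qw(decode);

my $bytes = "\x{C3}\x{A9}";     # UTF-8 bytes for "é", as an undecoded fetch returns them

# utf8::upgrade switches the internal storage and turns the flag on, but the
# string still holds the same two characters, U+00C3 and U+00A9.
my $upgraded = $bytes;
utf8::upgrade($upgraded);
printf "upgraded: flag=%d length=%d\n", (utf8::is_utf8($upgraded) ? 1 : 0), length($upgraded);
# upgraded: flag=1 length=2

# Decoding interprets the two bytes as one UTF-8 sequence: U+00E9.
my $decoded = decode('UTF-8', $bytes);
printf "decoded:  flag=%d length=%d\n", (utf8::is_utf8($decoded) ? 1 : 0), length($decoded);
# decoded:  flag=1 length=1

# utf8::decode does the same job in place, without loading Encode.
my $in_place = $bytes;
utf8::decode($in_place);
```

So the flag alone proves nothing: the upgraded string has the flag set and is still wrong.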

        Hi,

        First of all, I don't know sqlite3. There are several players in the game: sqlite3 and DBD::xxx. If the DBD driver you use for sqlite does not decode the byte strings that come from the sqlite database, then you have to do it yourself.

        use Encode qw(decode);
        my $decoded_string = decode('UTF-8', $byte_string_from_sqlite);

        Which driver 'DBD::xxx' are you using?

        UPDATE: Have a look at http://search.cpan.org/~ishigaki/DBD-SQLite/lib/DBD/SQLite.pm#DRIVER_PRIVATE_ATTRIBUTES. I'm pretty sure that is what you are looking for: sqlite_unicode

        Regards
        McA
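If the driver cannot be told to decode, the manual decode can be applied to a whole fetched row. The following is a sketch under assumptions: decode_row is a hypothetical helper, the row is a plain hashref like the one fetchrow_hashref returns, and every value in it is an undecoded UTF-8 byte string. Do not use it on data the driver has already decoded.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: return a copy of a row (hashref) with every defined
# value decoded from UTF-8 bytes into Perl characters. NULLs stay undef.
sub decode_row {
    my ($row) = @_;
    my %decoded;
    for my $column (keys %$row) {
        my $value = $row->{$column};
        $decoded{$column} = defined $value ? decode('UTF-8', $value) : undef;
    }
    return \%decoded;
}

# Made-up row, standing in for what $sth->fetchrow_hashref would return.
my $row = decode_row({ name => "J\x{C3}\x{BC}rgen", note => undef });
printf "name has %d characters\n", length($row->{name});   # name has 6 characters

# With DBD::SQLite the sqlite_unicode attribute mentioned above goes into the
# connect call instead (the file name is made up, not run here):
#   my $dbh = DBI->connect('dbi:SQLite:dbname=app.db', '', '', { sqlite_unicode => 1 });
```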

Re: Need tipps for identifying utf-8 problems with Dancer and MySQL
by Skeeve (Parson) on May 05, 2014 at 11:07 UTC

    For reference: I removed the on_connect_do entries from my development.yml (which I had copied and pasted from somewhere). They contained:

    on_connect_do: [ "SET NAMES 'utf8'", "SET CHARACTER SET 'utf8'" ]

    Now it seems to work…


Re: Need tipps for identifying utf-8 problems with Dancer and MySQL
by Anonymous Monk on May 05, 2014 at 11:07 UTC
    How are you letting DBI know to decode these strings as UTF8?

      The fun thing about Dancer is: You do not need to take care. At least that's my impression.



        The fun thing about Dancer is: You do not need to take care. At least that's my impression.

        Sounds like wishful thinking

Re: Need tipps for identifying utf-8 problems with Dancer and MySQL
by Skeeve (Parson) on Aug 27, 2014 at 09:19 UTC

    Another Update.

    It turned out that I had to set mysql_enable_utf8: 1 under "dbi_params:" in my config in order for Dancer to successfully store my input in my MySQL DB.
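For anyone using plain DBI instead of the Dancer config, the same option is passed as a connect attribute. A sketch, assuming DBD::mysql: mysql_connect_args is a hypothetical helper, and the database name, host, and credentials are placeholders. The actual connect call is left as a comment because it needs a running server.

```perl
use strict;
use warnings;

# Hypothetical helper: build the argument list for DBI->connect with
# mysql_enable_utf8 switched on.
sub mysql_connect_args {
    my (%opt) = @_;
    my $dsn  = "DBI:mysql:database=$opt{database};host=$opt{host}";
    my %attr = (
        RaiseError        => 1,
        mysql_enable_utf8 => 1,   # the option from the config above
    );
    return ($dsn, $opt{user}, $opt{password}, \%attr);
}

# Placeholder values, not taken from the original application.
my @args = mysql_connect_args(
    database => 'myapp',
    host     => 'localhost',
    user     => 'app',
    password => 'secret',
);
print "$args[0]\n";   # DBI:mysql:database=myapp;host=localhost

# With DBI and DBD::mysql installed and a server running:
#   require DBI;
#   my $dbh = DBI->connect(@args);
```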



Node Type: perlquestion [id://1085033]
Approved by Corion