way has asked for the wisdom of the Perl Monks concerning the following question:

Hi guys!

I have a problem with DBI and encoding. I'm using MySQL with all the %char% variables set to utf-8. When I try to insert some fields using placeholders, I get an error like this:

DBD::mysql::st execute failed: Incorrect string value: '\xEDnes' for column 'lastname' at row 1 at

I know this is basic, but I don't know where to start. Can you give me some ideas?

Regards

--------------------------------

SOLVED:

Don't forget the difference between utf8 and UTF-8

binmode STDIN, ':encoding(UTF-8)';

This is because I receive the data from a form and insert it into the MySQL DB.
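
A minimal sketch of the utf8 / UTF-8 distinction mentioned above, using only the core Encode module. It shows that Encode resolves the two spellings to different encodings (a strict one and Perl's lax one); the variable names are just for illustration.

```perl
use strict;
use warnings;
use Encode qw(find_encoding);

# Encode treats the two spellings as different encodings:
# "UTF-8" is the strict, standards-conforming one,
# "utf8" is Perl's older, lax variant.
my $strict_name = find_encoding('UTF-8')->name;
my $lax_name    = find_encoding('utf8')->name;

print "UTF-8 resolves to: $strict_name\n";
print "utf8  resolves to: $lax_name\n";

# The layer used in the fix above decodes incoming data strictly.
binmode STDIN, ':encoding(UTF-8)';
```

On a stock Perl this should report `utf-8-strict` for the first name and `utf8` for the second.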

Replies are listed 'Best First'.
Re: DBI and encoding problem
by almut (Canon) on Feb 21, 2009 at 10:11 UTC

    Looks like your input data isn't really in UTF-8 encoding...  In UTF-8, \xED would start a multibyte sequence, which may not be followed by 'n' (i.e. \x6E, a byte without the 8th/high bit set), as it is here. This is simply invalid UTF-8 encoding, which is presumably why MySql complains.
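
The point about `\xED` followed by `n` can be checked directly. The sketch below feeds the bytes from the error message to a strict UTF-8 decoder and then, purely as an assumption about where the data might have come from, reads the same bytes as ISO-8859-1.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK);

my $bytes = "\xEDnes";   # the bytes quoted in the error message

# Strict decoding dies on malformed input when FB_CROAK is given.
# Work on a copy, since decode() with a CHECK value may modify its argument.
my $copy    = $bytes;
my $as_utf8 = eval { decode('UTF-8', $copy, FB_CROAK) };
my $is_valid_utf8 = defined $as_utf8 ? 1 : 0;

# Assumption: the data is really ISO-8859-1. Read that way, the same
# bytes are four ordinary characters, the first being U+00ED.
my $as_latin1 = decode('ISO-8859-1', $bytes);

printf "valid UTF-8: %s\n", $is_valid_utf8 ? 'yes' : 'no';
printf "as Latin-1: %d characters, first is U+%04X\n",
    length $as_latin1, ord $as_latin1;
```

If the strict decode fails while the Latin-1 reading gives a plausible name, the input is probably not UTF-8 to begin with.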

    It's hard to tell more without knowing what your input data originally was and how it's being processed in your code...

Re: DBI and encoding problem
by graff (Chancellor) on Feb 21, 2009 at 15:50 UTC
    Do you execute this sql statement when you first establish your connection to the database?
    $dbh->do( "set names utf8" );
    And where do the string values come from that you are using as parameters for the placeholders in your query? Are they flagged in your perl script as utf8 strings (because they come from a ":utf8"-mode file handle or from an Encode::decode("utf8",...) call)?

    I'm actually grateful to you for posing this question, because it led me to realize that perl's "special" handling of unicode characters in the \x{80} - \x{ff} range seems to require that you do some "special" handling when passing strings to mysql. Here's a test script:

    The point is that perl still uses a single byte internally for characters in the original Latin-1 range (from iso-8859-1), and even when perl has flagged a Latin-1 string as being utf8 data, you should "encode" it into an external (true multibyte) form before sending it to the database.

    (Try changing the "prep" and "deprep" subs in the test script, to skip the "encode" and "decode" calls, and you'll see a difference in the results.)
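
As a rough illustration of the encode/decode step described above (this is not the original test script; the sub names `prep` and `deprep` are taken from the text, but their bodies here are an assumption), the round trip can be sketched without a database:

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helpers: the names come from the post, the bodies are assumed.
sub prep   { return encode('UTF-8', $_[0]) }   # characters -> UTF-8 bytes
sub deprep { return decode('UTF-8', $_[0]) }   # UTF-8 bytes -> characters

my $name  = "\x{ED}nes";   # 4 characters: U+00ED followed by "nes"
my $bytes = prep($name);   # external, multibyte form

printf "characters: %d, bytes: %d\n", length $name, length $bytes;
printf "hex: %s\n", join ' ', map { sprintf '%02X', ord } split //, $bytes;

# Decoding the bytes gives back the original character string.
my $back = deprep($bytes);
print "round trip ok\n" if $back eq $name;
```

With DBI, the encoded value would be what gets bound to the placeholder, for example `$sth->execute( prep($lastname) )`, and `deprep` would be applied to values fetched back. Whether this explicit step is needed depends on how the connection is configured.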

    update: added "use utf8" to the test script, which did not change its behavior, but might help clarify the issue.

      Thanks for your response. Yes, I changed the client connection to UTF-8. I have updated this post with my solution, based on your response.