RnC has asked for the wisdom of the Perl Monks concerning the following question:

There's a certain function in a program I wrote which inserts data into a MySQL database using DBI. Here's a snippet:
    my $MYSQL_insert = "INSERT INTO $MYSQL_tablename (title, link, description, pubdate, source) VALUES (?, ?, ?, ?, ?)";
    my $query_handle = $dbhandle->prepare($MYSQL_insert);
    $query_handle->execute($title, $link, $description, $pubdate, $source);
It all works fine, but I'm having problems when inserting words like "cólon" (it shows up as cólon), or anything with a non-English character: "nã" or a cedilla appears as a bunch of junk. The weird thing is that if I do the INSERT via the MySQL shell using any of these characters, it works fine, which I can check by SELECTing. That leads me to believe that DBI is somehow messing up the encoding. I already tried changing the locale (both from the shell and by using setlocale()), but the problem persists.

If I print the contents of the variables to the shell, I can see that Perl handles the accents correctly, so it has to be something DBI is doing in either prepare() or execute(). Any ideas on where I can find a fix for that? I've been through DBI's docs a hundred times and I still can't find anything related.

Thanks in advance.

2006-01-09 Retitled by g0n, as per Monastery guidelines
Original title: 'DBI hadling special chars like accented ones'

Replies are listed 'Best First'.
Re: DBI handling special chars like accented ones
by explorer (Chaplain) on Jan 09, 2006 at 10:31 UTC
    You created the database with ISO-8859-1 encoding, but you are inserting UTF-8 characters...
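A mismatch of this kind can be reproduced without a database. The following is only a minimal sketch: it takes the word from the question, encodes it as UTF-8, and then reads those bytes back as if they were ISO-8859-1. It uses the core Encode module; the variable names are made up for illustration, and the exact junk seen in a real table depends on the server and connection settings.

```perl
use strict;
use warnings;
use utf8;                      # this source file itself is UTF-8
use Encode qw(encode decode);

my $word = "cólon";            # 5 characters

# Encoding to UTF-8 turns the single character ó into two bytes (0xC3 0xB3).
my $utf8_bytes = encode('UTF-8', $word);

# Reading those bytes as ISO-8859-1 maps each byte to one character.
my $misread = decode('ISO-8859-1', $utf8_bytes);

binmode(STDOUT, ':encoding(UTF-8)');
print length($word), "\n";        # 5
print length($utf8_bytes), "\n";  # 6
print $misread, "\n";             # cÃ³lon
```

If the junk in the table looks like this, with two characters where one accented letter should be, the data was most likely sent as UTF-8 and interpreted as a single-byte encoding somewhere along the way.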
Re: DBI handling special chars like accented ones
by Aristotle (Chancellor) on Jan 09, 2006 at 15:17 UTC
      I feel like I'm punching nails here =(

      This is what I've tried so far:
      - The patch suggested by randyk. I downgraded DBD::mysql to version 2.9008 to be able to apply it.
      - Rebuilding the database AND the tables, setting the default charset to utf8.
      - Using the Encode module to encode a string as utf8, similar to what is suggested here.

      None of the above worked: I still get "junk" when trying to INSERT via DBI. It has to be either the module or some missing locale setting (I've tried changing that as well with POSIX locale_h, without success), since I can perform INSERTs with accents via the MySQL shell just fine.

      Any help is appreciated.
        OK, I found a solution.

        The patch mentioned in the parent post has no effect for this particular case, so I rolled back to DBD-mysql-3.0002. The actual fix consists of using the utf8 pragma and then upgrading the string, like this:

            use utf8;
            ...
            my $string = "voilá monsieur";
            utf8::upgrade($string);
            # $string is now ready to be INSERTed by DBI,
            # but you still need to do this
            $dbhandle->do("SET CHARACTER SET utf8");
            # now you can perform the INSERT
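For anyone wondering what utf8::upgrade changes, it can be checked in isolation. This is a minimal sketch, assuming the string starts out as a Latin-1 byte string; the \xe1 escape and the variable names are assumptions for illustration, not the literal from the post above. The text stays the same, only Perl's internal representation (and its UTF-8 flag) changes.

```perl
use strict;
use warnings;

# Assumption for illustration: a Latin-1 byte string, with á written as \xe1.
my $string = "voil\xe1 monsieur";
my $before = $string;

my $flag_before = utf8::is_utf8($string) ? 1 : 0;
utf8::upgrade($string);        # switch the internal storage to UTF-8
my $flag_after  = utf8::is_utf8($string) ? 1 : 0;

# The characters themselves are unchanged by the upgrade.
my $same_text = ($string eq $before) ? 1 : 0;

print "flag before: $flag_before\n";            # flag before: 0
print "flag after:  $flag_after\n";             # flag after:  1
print "same text:   $same_text\n";              # same text:   1
print "length:      ", length($string), "\n";   # length:      14
```

Whether the upgrade matters for the INSERT depends on how the driver reads the string's internal bytes, so treat this as an explanation of the mechanism rather than a guarantee for every DBD::mysql version.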
Re: DBI handling special chars like accented ones
by randyk (Parson) on Jan 09, 2006 at 15:12 UTC
      Problem: I'm currently using DBD-mysql-3.0002, and the patch is for version 2.9008 (obviously it failed when I tried to apply it).

      Any ideas apart from a possible downgrade?