Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

We recently upgraded our system (Debian woody) from Perl 5.6.1 to Perl 5.8. We have a content management system that makes heavy use of XML. However, there are occasions where we want to store ISO-encoded data in our MySQL database, and others where we want XML. We worked around several pitfalls during the upgrade, but this one is a mystery to me:

I have a CGI script that provides a user interface for updating some database content. It is a generic script that reads an XML config file defining the database tables and fields to update or insert. This file is processed by XML::Parser, and the web form and the insert/update/select statements are constructed dynamically from it (in a rather complicated way). In this case we want ISO in the database.

However, there is the following problem: with 5.6.1 everything worked fine, but now with 5.8 we unexpectedly get Unicode in the database when we use DBI!

Like this:
use FW::Database;
use XML::Parser;
.
.
# a lot of stuff that constructs $stmt
.
DB_insertdata($stmt, $dbh);
where DB_insertdata is (among other things) in FW::Database:
package FW::Database;
require Exporter;
use vars qw(@ISA @EXPORT %conf);
@ISA = qw(Exporter);
@EXPORT = qw( &DB_insertdata );

sub DB_insertdata {
    my $stmt = $_[0];
    my $dbh = $_[1];
    my $error = "";
    my $sth = $dbh->prepare($stmt) || FW::Utils::handle_dberror("$DBI::errstr", $dbh);
    my $rv = $sth->execute() || FW::Utils::handle_dberror("$DBI::errstr", $dbh);
}
When I print $stmt, either to the browser or via STDERR to the error.log, everything is ISO, but when the same string is inserted via DBI I get Unicode in the database! So the current emergency workaround is to forget about DBI and replace the body of DB_insertdata with
open (PIPE, "|/usr/local/mysql/bin/mysql -u user database");
print PIPE $stmt;
close(PIPE);
which works, but is of course far from acceptable code.

I am sorry that this got so lengthy. Nevertheless, I hope that someone can provide a clue.

Thank you,
Karlheinz

Replies are listed 'Best First'.
Re: Perl 5.8, DBI and unicode
by CountZero (Bishop) on Dec 23, 2002 at 14:59 UTC

    I don't have Perl 5.8, so my views may perhaps not be very helpful, but still:

    • Did you check the readme listing the changes from Perl 5.6.1 to 5.8? Does it say anything special about input and output encodings?
    • Is the data you put into the database perhaps already in Unicode format before it is processed by your script? Perhaps other things were changed in your set-up besides Perl?
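
The second point can be checked mechanically. What follows is only a minimal sketch, not something taken from the original script: looks_like_utf8 is a hypothetical helper name, and it assumes the core Encode module that ships with Perl 5.8. It reports whether a byte string contains non-ASCII bytes that form valid UTF-8, which is a strong hint (though not proof) that the data was already UTF-8 before the script touched it.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: true if $bytes contains at least one non-ASCII byte
# and the whole string is well-formed UTF-8.
sub looks_like_utf8 {
    my ($bytes) = @_;
    return 0 unless $bytes =~ /[\x80-\xFF]/;   # pure ASCII tells us nothing
    my $copy = $bytes;                          # decode() with a CHECK value may modify its argument
    my $ok = eval { Encode::decode('UTF-8', $copy, Encode::FB_CROAK); 1 };
    return $ok ? 1 : 0;
}

# "Mueller" with u-umlaut, once as Latin-1 bytes and once as UTF-8 bytes
print "Latin-1 input: ", (looks_like_utf8("M\xFCller")     ? "UTF-8" : "not UTF-8"), "\n";
print "UTF-8 input:   ", (looks_like_utf8("M\xC3\xBCller") ? "UTF-8" : "not UTF-8"), "\n";
```

Running such a check on the raw form data and on the text coming out of the config file would show at which stage the encoding changes.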

    CountZero

    "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

      Good thoughts. I do have Perl 5.8, but I don't know the answer to your question. However, I did look in the perldocs and found the "perlunicode" section, accessible by doing
      perldoc perlunicode
      at a command prompt. You might find some useful info in there.

      If you don't find a good answer here, you should post your question on the Perl DBI mailing list - I'm pretty sure someone would be able to help you there.
        I have read the docs, especially perldoc perlunicode.
        But I must confess that I don't think I have completely understood Perl's Unicode model...

        What baffles me is that when I print the SQL statement, either to the browser or to the Apache log, it shows up as fine ISO. But then it ends up as Unicode in the database. Working explicitly with Encode::decode_utf8 on the data also had no effect.
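
One possible reading of these symptoms, offered as an assumption rather than a confirmed diagnosis: XML::Parser hands back character data that Perl 5.8 stores internally as UTF-8. print on a handle without an encoding layer writes such a string as Latin-1 as long as every character fits in one byte, whereas a database driver that reads the string's internal buffer would pass the UTF-8 octets through. If that is what happens here, the conversion needed is Encode::encode (characters to bytes), not decode_utf8 (bytes to characters). A minimal sketch using only core modules follows; to_latin1_bytes, the table t and the column name are hypothetical.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Stand-in for text returned by an XML parser: character data, held internally as UTF-8.
my $chars = decode('UTF-8', "M\xC3\xBCller");

# Hypothetical helper: turn a character string into plain ISO-8859-1 bytes.
sub to_latin1_bytes {
    my ($text) = @_;
    return encode('ISO-8859-1', $text);
}

# Table and column names are made up for illustration.
my $stmt_bytes = to_latin1_bytes("INSERT INTO t (name) VALUES ('$chars')");

# What code reading the internal buffer directly would see: the UTF-8 octets.
my $internal = $chars;
utf8::encode($internal);   # in place: characters -> UTF-8 octets

printf "characters:            %d\n", length $chars;
printf "internal UTF-8 octets: %d\n", length $internal;
printf "statement is flagged:  %s\n", utf8::is_utf8($stmt_bytes) ? "yes" : "no";
```

Under that assumption, passing the encoded statement (here $stmt_bytes) to DB_insertdata instead of the original $stmt would be the thing to try before giving up on DBI.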