in reply to Perl Mysql and UTF8

I was not able to replicate your results. I loaded utf8 text data from a file into a table and got it back unharmed, using both the mysql command line utility and a perl DBI script.

The one difference between my Perl/DBI attempt and yours might be that I prepared the insert (or update) statement with a placeholder where the text value would go, rather than trying to build the statement with the actual text value in place and trying to quote it somehow -- that is:

    use DBI;

    my $db = DBI->connect( blah, blah );
    my $sql = $db->prepare( "insert into my_table (col1,col2) values (?,?)" );
    while (<>) {   # assume we are reading rows of data...
        chomp;
        my ( $v1, $v2 ) = split( /\t/ );  # ... tab-delimited
        $sql->execute( $v1, $v2 );
    }
    $sql->finish;

    # now try reading stuff back:
    $sql = $db->prepare( "select col1,col2 from my_table" );
    $sql->execute;  # (update: forgot to put this in at first)
    my $rowref = $sql->fetchall_arrayref;
    for my $row ( @$rowref ) {
        print join( "\t", @$row ), $/;
    }
    $sql->finish;

As a strange little aside: you may need to be careful in your perl script about setting the character semantics on whatever file handles you use for input and output. If both are supposed to be utf8, then I suggest being explicit about that in your perl code (e.g. include binmode(STDOUT,":utf8"); if you're writing stuff to STDOUT).
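
To make the file-handle point concrete, here is a minimal sketch that leaves the database out entirely. It assumes a made-up sample row held in an in-memory buffer (the variable names and the sample text are hypothetical), and it uses the stricter ":encoding(UTF-8)" layer where the example above uses ":utf8". It reads the row through an explicit UTF-8 layer, splits it on the tab, and writes it back through another explicit UTF-8 layer:

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical sample: one tab-delimited row with non-ASCII text,
# held as raw UTF-8 bytes, the way it would sit in a file on disk.
my $bytes = encode( 'UTF-8', "caf\x{e9}\t\x{65e5}\x{672c}\x{8a9e}\n" );

# Read through an explicit UTF-8 layer so perl sees characters, not bytes.
open( my $in, '<:encoding(UTF-8)', \$bytes ) or die "open for read: $!";
my $line = <$in>;
close $in;
chomp $line;
my ( $v1, $v2 ) = split( /\t/, $line );

# Write through an explicit UTF-8 layer so the characters are
# encoded back into UTF-8 bytes on the way out.
my $out_bytes = '';
open( my $out, '>:encoding(UTF-8)', \$out_bytes ) or die "open for write: $!";
print {$out} join( "\t", $v1, $v2 ), "\n";
close $out;
```

With both layers declared, length($v1) counts characters rather than bytes, and the bytes written out should match the bytes read in. For real files or STDIN/STDOUT the same layers can be given to open or binmode.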

As for controlling character semantics in the database transactions, in general I'd say don't -- data is data as far as the RDBMS is concerned (mysql or other), and whatever byte sequence you put in, that's what you'll get back, so long as you use DBI's placeholder / parameter syntax for putting data values into the sql statements.