in reply to UTF8 and Postgresql

I have a feeling that something is broken with the DBD::Pg driver. If it were Unicode-clean, you shouldn't have this problem. Clearly your database is set up to store text as UTF8, and the Perl side certainly knows about Unicode, so why the disconnect?

One solution would be to simply remove all non-ASCII characters before insertion:

$text =~ s/[^\0-\x7f]//g;
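
To make the effect of that substitution concrete, here is a small sketch. The sample string is made up for illustration; note that the accented character and the euro sign are thrown away, not converted, so this approach loses data.

```perl
use strict;
use warnings;

# Hypothetical sample text: "caf", e-acute, space, euro sign, "5".
my $text = "caf\x{e9} \x{20ac}5";

# Work on a copy so the original string stays available for comparison.
(my $stripped = $text) =~ s/[^\0-\x7f]//g;

# Only the ASCII characters survive.
print "$stripped\n";
```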

Another potential option is to change the client_encoding to something like LATIN1. You can do this for your own session with SET client_encoding; changing the default for the whole database needs administrator privs.

Yet another possibility is to convert your text to UTF8:

use Encode;
...
my $bytes = encode('utf8', $text);
# insert $bytes instead of $text

I think this last solution has a good chance of working if my theory of what's happening is correct. You'll have to check what you get out of the database when you read the data back - hopefully it will be the same as $text. If it isn't, try decoding it using decode('utf8', ...).
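
Here is a rough sketch of that encode/decode round trip without a database in the middle. The sample string is invented, and $bytes simply stands in for whatever you would read back from the table, so treat it as an illustration of the idea, not as a test of DBD::Pg itself.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical sample text containing one non-ASCII character (e-acute).
my $text = "caf\x{e9}";

# Turn the character string into UTF-8 encoded bytes before insertion.
my $bytes = encode('utf8', $text);

# ... insert $bytes, then later select the value back ...
# Assumption: the database hands back the same bytes that were stored.
my $round_trip = decode('utf8', $bytes);

# The character count and the byte count differ for non-ASCII text.
print length($text), " characters, ", length($bytes), " bytes\n";
print $round_trip eq $text ? "round trip ok\n" : "round trip differs\n";
```

If the value you read back already compares equal to $text, the driver is decoding for you and the extra decode() is not needed.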

Re^2: UTF8 and Postgresql
by Anonymous Monk on Apr 30, 2008 at 23:05 UTC
    Thanks, everyone! The encode() worked perfectly.