zby has asked for the wisdom of the Perl Monks concerning the following question:
    use Encode qw/decode is_utf8/;
    use DBI;

    my $dbh = DBI->connect("dbi:Pg:dbname=ab", "***", "***",
        { RaiseError => 1, AutoCommit => 0 });

    my $a = decode('iso8859-1', "\x{92}");

    $dbh->do("CREATE TABLE a (a text)");
    $dbh->do("INSERT INTO a(a) VALUES (?)", {}, $a);
    my ($b) = $dbh->selectrow_array("SELECT * FROM a");

    if ($b eq $a) {
        print "Equals a: $a, b: $b\n";
    } else {
        print "Not equals a: $a, b: $b\n";
    }
    print "a is_utf8: ", is_utf8($a), "\n";
    print "b is_utf8: ", is_utf8($b), "\n";

    $dbh->rollback;
    $dbh->disconnect;

And here is its output:

    Not equals a: ’, b: Â’
    a is_utf8: 1
    b is_utf8:

Surprise! The value that we get back is not equal to what we fed to the database. The database is PostgreSQL with UTF8 as the main charset. I am not sure whether I did something wrong, or what the behaviour of this code snippet should be, but this struck me as very surprising.
Update: Adding _utf8_on($b) sets the UTF8 flag on $b and then indeed $a eq $b.
Update: The perl version is 5.8.3 on Linux (thanks mirod).
Update: After some more meditation on this I see that it is not much different from the case where you feed, let's say, the string '5.0' into a numeric field and get back '5'. You just need to remember that the UTF8 flag is lost when the value goes through the database.
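For anyone who would rather not poke at the flag directly with _utf8_on, here is a minimal sketch of the other route: decoding the fetched value as UTF-8. It needs no database; the encode() call is only an assumed stand-in for the round trip, on the premise that the driver hands back the raw UTF-8 octets with the flag off (which is what the output above suggests). The variable names $sent, $fetched and $decoded are made up for the illustration.

```perl
use strict;
use warnings;
use Encode qw(encode decode is_utf8);

# The character string from the snippet above: U+0092, UTF8 flag on.
my $sent = decode('iso-8859-1', "\x{92}");

# Assumption: stand-in for the INSERT/SELECT round trip. The value comes
# back as UTF-8 octets (0xC2 0x92) without the UTF8 flag.
my $fetched = encode('UTF-8', $sent);

# Two octets compared against one character: not equal.
print "raw equal: ", ($fetched eq $sent ? "yes" : "no"), "\n";

# Decode the octets back into a character string. Unlike _utf8_on,
# decode() also checks that the octets are well-formed UTF-8.
my $decoded = decode('UTF-8', $fetched);

print "decoded equal: ", ($decoded eq $sent ? "yes" : "no"), "\n";
print "decoded is_utf8: ", (is_utf8($decoded) ? 1 : 0), "\n";
```

With the real database, the same decode('UTF-8', ...) call would be applied to $b right after selectrow_array, provided the driver really does return undecoded octets.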
Re: Beware of the UTF_8 flag
by mirod (Canon) on Apr 22, 2004 at 13:01 UTC

Re: Beware of the UTF_8 flag
by borisz (Canon) on Apr 22, 2004 at 13:36 UTC

Re: Beware of the UTF_8 flag
by ysth (Canon) on Apr 22, 2004 at 16:12 UTC