perlmonkdr has asked for the wisdom of the Perl Monks concerning the following question:

Hello guys,

I want to use DBD::CSV with utf8 data. The manual says that for DBI I need to add the "unicode" attribute, but it doesn't work. To be clearer, this is what I'm doing:

our $st = DBI->connect('DBI:CSV:f_dir=.', undef, undef, { unicode => 1 });

Is something wrong? I didn't add any pragma at first; I also tested with use utf8; but that doesn't work either.

What can I do?

Thanks in advance.

Re: DBD::CSV with utf8
by graff (Chancellor) on Nov 07, 2007 at 05:52 UTC
    I just installed the current version of DBD::CSV from CPAN (which involved updating my version of DBD::File so that I could install SQL::Statement). I wasn't able to find any mention of the "unicode => 1" attribute setting that you mentioned -- not in DBI, or Text::CSV_XS, DBD::File or DBD::CSV. (Where did you find that, exactly?)

    I tried this little test script, which involves storing a string that includes an Arabic character in each row:

    use strict;
    use DBI;

    my $Usage = "$0 ";
    my $db = DBI->connect("DBI:CSV:");  # use current working directory
    $db->do("CREATE TABLE csv_test (id INTEGER, name CHAR(16))");
    my $sth = $db->prepare("INSERT INTO csv_test (id,name) values (?,?)");
    my @strings = ( "one \x{0661}", "two \x{0662}", "three \x{0663}" );
    binmode STDOUT, ":utf8";
    for (0..$#strings) {
        printf "inserting %d,%s\n", $_+1, $strings[$_];
        $sth->execute( $_+1, $strings[$_] );
    }
    $sth->finish();
    $db->disconnect;

    Having run that, I found that the resulting csv_test "database" file did in fact have valid and correct utf8 characters in it, as intended. When I added the following lines to the script and ran it again, I saw the problem:

    $sth = $db->prepare("SELECT * FROM csv_test");
    $sth->execute;
    while( my $row = $sth->fetchrow_arrayref ) {
        printf "retrieved %d,%s\n", @$row;
    }
    $sth->finish;

    The problem was that when the perl script reads the strings back from the "database" file, it has no way of knowing that the strings are utf8. To get it to come out right, I have to add use Encode; to the script, and add the following line just before the printf statement:
    $$row[1] = decode( "utf8", $$row[1] );
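
    The effect of that decode step can be seen without any CSV file at all. The following is only a minimal sketch using the core Encode module; the $bytes variable is an assumption standing in for what comes back from the file, namely the same text as undecoded UTF-8 octets.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# The character string as inserted: "one " plus ARABIC-INDIC DIGIT ONE.
my $original = "one \x{0661}";

# Stand-in (assumption) for the value read back from the CSV file:
# the same text as raw UTF-8 octets, not yet decoded.
my $bytes = encode( "utf8", $original );

# Decoding turns the octets back into Perl characters.
my $chars = decode( "utf8", $bytes );

binmode STDOUT, ":utf8";
printf "octets: %d, characters: %d\n", length($bytes), length($chars);
print "round trip ok\n" if $chars eq $original;
```

    The undecoded value is six octets long, while the decoded string is five characters long, which is why printing the undecoded value through a ":utf8" layer comes out garbled.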
    It's sort of a shame not being able to tell perl that the file data should be read as utf8 in the first place; you just have to work around that on your own with Encode. You could do that as part of the fetch:
    my @values = map { decode( "utf8", $_ ) } $sth->fetchrow_array;
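
    If the same decoding is needed in several places, it could be wrapped in a small helper. The sketch below is a stricter variation, not what the one-liner above does: it uses the strict "UTF-8" encoding name and Encode::FB_CROAK so that malformed data dies instead of passing through quietly. decode_row is a hypothetical helper name, and @raw is an assumption standing in for the result of $sth->fetchrow_array.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: decode every field of one fetched row.
# Strict "UTF-8" plus FB_CROAK dies on malformed input; LEAVE_SRC
# keeps the caller's values untouched. undef fields stay undef.
sub decode_row {
    return map {
        decode( "UTF-8", $_, Encode::FB_CROAK() | Encode::LEAVE_SRC() )
    } @_;
}

# Stand-in (assumption) for: my @raw = $sth->fetchrow_array;
my @raw = ( 2, encode( "UTF-8", "two \x{0662}" ) );

my @values = decode_row(@raw);

binmode STDOUT, ":utf8";
printf "retrieved %d,%s\n", @values;
```

    With a real statement handle, the call would presumably become my @values = decode_row( $sth->fetchrow_array );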

      Thank you, graff,

      The unicode attribute is described here: DBD::SQLite. But you're right, that option is only for the SQLite module, because utf8 is supported by default in DBI. I finally found the problem: I was reading the CSV file with EditPlus, my default editor for any code, and it doesn't recognize UTF-8 by default. When I opened the file with another editor, it was all OK, just as you said.

      Because of this I assumed that DBI didn't recognize the encoding, since when I tried to read the data back the way you described:

      my @values = $sth->fetchrow_array;

      it didn't work. Just as you said, the solution was:

      my @values = map { decode( "utf8", $_ ) } $sth->fetchrow_array;

      This is a big help to me; I'm very grateful.

      Thanks again, best regards.