in reply to DBI, unicode, ms sql server

search DBD::ODBC for utf/unicode and read

Replies are listed 'Best First'.
Re^2: DBI, unicode, ms sql server
by zod (Scribe) on Feb 08, 2009 at 03:38 UTC
    Yeah, I read that already. The gist of it seems to be that unicode is enabled by default on Windows with the latest version, but that utf-8 will be converted to utf-16.

    i.e. : "perl scalars which are UTF-8 and are sent through the ODBC API will be converted to UTF-16 and passed to the ODBC wide APIs or signalled as SQL_WCHARs (e.g., in the case of bound columns)."

    I didn't see anything in the docs about how to actually get around that, however.

      Don't inline Unicode values (in fact, don't inline ANY values) in your SQL. Always use placeholders, i.e. my $sth=$dbh->prepare("insert into XXX values (?,?)"; $sth->execute("\x{2345}\x{3333}","\x{4711}"); $sth->finish(); or $dbh->do("insert into XXX values (?,?)",undef,"\x{2345}\x{3333}","\x{4711}");. Inline values prevent SQL caching, inline values open security holes (SQL injection), inline values break Unicode.

      See also the second paragraph under "Problems" in DBD-ODBC-1.18/README.af

      Alexander Foken

        Thanks for the info.
      How do you verify it inserts garbage? MS SQL Server 2008 doesn't support utf8, only utf16, so either DBD::ODBC does the conversion to/from incorrectly, or something else :D
        I verify the garbage by selecting from the table in SQL Server Management Studio. I can insert utf-8 via SSMS without a problem. So it is something related to the ODBC driver through perl. I've also been able to bulk insert utf-8 text from MySQL into this SQL Server db via SSIS -- so it isn't on the sql server end of things.

        So, I'm stumped at the moment...