in reply to Re^5: Mugged by UTF8, this CANNOT be right
in thread Mugged by UTF8, this CANNOT be right

Thank you, ++mje! This is very helpful to know. And it's authoritative, too, as it's coming straight from the maintainer of DBD::ODBC.

Your post prompted me to reread the documentation of DBD::ODBC's Unicode support more carefully. Amid the wealth of detail about Unicode and the various drivers for different RDBMSes, the documentation does state:

DBD::ODBC uses the wide character versions of the ODBC API and the SQL_WCHAR ODBC type to support unicode in Perl.
Wide characters returned from the ODBC driver will be converted to UTF-8 and the perl scalars will have the utf8 flag set (by using sv_utf8_decode).
perl scalars which are UTF-8 and are sent through the ODBC API will be converted to UTF-16 and passed to the ODBC wide APIs or signalled as SQL_WCHARs (e.g., in the case of bound columns).
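
To make those two conversions concrete for myself, here is a minimal sketch using only the core Encode module. It does not touch DBD::ODBC or any database; the real conversion happens inside the driver's C code. The sample bytes and variable names are my own assumptions, standing in for what a driver might hand back in a wide-character buffer.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical sample data: "café" as UTF-16LE bytes, standing in for
# the contents of a wide-character (SQL_WCHAR) buffer.
my $wide_bytes = "c\0a\0f\0\xE9\0";

# Inbound direction: wide characters become a Perl character string.
my $text = decode('UTF-16LE', $wide_bytes);
printf "characters=%d utf8_flag=%s\n",
    length($text), utf8::is_utf8($text) ? 'on' : 'off';

# Outbound direction: the character string is converted back to UTF-16
# bytes, as it would be before being passed to a wide ODBC API.
my $round_trip = encode('UTF-16LE', $text);
print "round trip ", ($round_trip eq $wide_bytes ? "ok" : "FAILED"), "\n";
```

The point is only that the Perl side sees characters rather than bytes: length() counts four characters even though the wide buffer holds eight bytes.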

I think it might be helpful to have an entry in the DBD::ODBC FAQ like "How Do I Handle Unicode Text With MS Access?" that simply and plainly explains that, mostly, it should Just Work. (Shouldn't it?)

Re^7: Mugged by UTF8, this CANNOT be right
by mje (Curate) on Jan 27, 2011 at 17:29 UTC

    On Windows, it should just work (with a few exceptions for old ODBC 1 and 2 drivers). On UNIX it is harder, because Unicode support differs among ODBC drivers and ODBC driver managers, which is why it is an optional build setting on UNIX. I'll consider a FAQ entry, but probably a more general one than you propose; otherwise I'll end up adding loads of entries, one per driver/database.