Your database may be using UTF8, but if your clients are anything like ours (Windows), they are probably connecting with the client character set set to:

NLS_LANG=JAPANESE_JAPAN.JA16SJIS

That means the Oracle client translates the characters to UTF8 when it sends them to the database, and back to SJIS when it retrieves them. Windows still doesn't have a decent UTF8 IME for Japanese, at least according to our Japanese users.
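To make that translation concrete, here is a rough sketch using Perl's core Encode module as a stand-in for the Oracle client. No database is involved, the sub names are made up for illustration, and it assumes Encode's `shiftjis` is close enough to Oracle's JA16SJIS for this purpose (the two mappings may differ for some characters).

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Stand-ins for what the client layer does conceptually (hypothetical names).
sub client_to_db { my ($sjis) = @_; return encode('UTF-8',    decode('shiftjis', $sjis)); }
sub db_to_client { my ($utf8) = @_; return encode('shiftjis', decode('UTF-8',    $utf8)); }

my $sjis_in  = "\x93\xFA\x96\x7B\x8C\xEA";   # the SJIS bytes for "nihongo" (3 kanji)
my $stored   = client_to_db($sjis_in);       # what a UTF8 database would hold
my $sjis_out = db_to_client($stored);        # what the SJIS client gets back

printf "client bytes: %d, stored bytes: %d\n", length $sjis_in, length $stored;
print "round trip ", ($sjis_out eq $sjis_in ? "ok" : "changed"), "\n";
```

The byte counts differ because each of these kanji takes two bytes in SJIS and three in UTF8, which is also why column sizes defined in bytes can surprise you.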

For my Japanese web interface, I connect to the database using the same NLS_LANG setting and treat the data as binary. Oracle gives me SJIS data, Perl doesn't change it, and the clients get the SJIS characters they are expecting.
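A minimal sketch of that setup might look like the following. It assumes DBI and DBD::Oracle are installed; the `customers` table, its `name` column, and the sub names are hypothetical, and the DSN, user, and password come from the command line. This is one way to do it, not necessarily how the interface described above is written.

```perl
use strict;
use warnings;

# Set the client character set before the Oracle client libraries are loaded.
BEGIN { $ENV{NLS_LANG} = 'JAPANESE_JAPAN.JA16SJIS' }

# Tell the browser the page is SJIS, since the bytes are passed through as-is.
sub content_type_header { return "Content-Type: text/html; charset=Shift_JIS\r\n\r\n" }

# Fetch one column as raw bytes; nothing here decodes or re-encodes the data.
sub fetch_names {
    my ($dsn, $user, $pass) = @_;
    require DBI;    # loaded lazily so the rest of the file works without it
    my $dbh  = DBI->connect($dsn, $user, $pass, { RaiseError => 1 });
    my $rows = $dbh->selectcol_arrayref('SELECT name FROM customers');  # hypothetical table
    $dbh->disconnect;
    return $rows;
}

# Usage: script.pl dbi:Oracle:ORCL scott tiger
if (@ARGV == 3) {
    binmode STDOUT;                 # raw bytes out, no encoding layer
    print content_type_header();
    print "$_<br>\n" for @{ fetch_names(@ARGV) };
}
```

The important parts are setting NLS_LANG before connecting and never putting an encoding layer on the output handle.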

If you really are using UTF8 on the client, you should be able to connect with your client character set as NLS_LANG=(whatever).UTF8 and get UTF8 from the database. But I would make darn sure that your clients really are doing that. (Everyone here said we were using UTF8(*), but when I checked the actual Windows machines, they were using JA16SJIS, WE8ISO8859P1, and other local character sets(**) on the clients.)
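One quick check, sketched below, is to pull the character set out of NLS_LANG, which has the form LANGUAGE_TERRITORY.CHARSET. The helper name `nls_charset` is made up, and this only looks at the environment of the process that runs it, so run it the same way your real client is started.

```perl
use strict;
use warnings;

# Return the CHARSET part of an NLS_LANG value, or undef if there is none.
sub nls_charset {
    my ($nls_lang) = @_;
    return unless defined $nls_lang;
    my ($charset) = $nls_lang =~ /\.([^.]+)\z/;
    return $charset;
}

my $charset = nls_charset($ENV{NLS_LANG});
print defined $charset
    ? "client character set: $charset\n"
    : "NLS_LANG is not set or has no character set part\n";
```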

Endnotes

(*) Your server is probably using UTF8 as the character set, which is important because Oracle can translate any other character set to UTF8 and back losslessly. But your client character set is probably whatever your client systems can display most easily.

(**) Additional cautions on Oracle client character sets: the NLS_LANG environment variable is the only way to set the client character set, and there can only be one character set per client process. I have to start completely separate Apache servers in order to handle multiple client character sets.
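The one-character-set-per-process rule can be sketched like this: a parent starts each child with its own NLS_LANG. The child here is just a Perl one-liner that echoes the variable, standing in for a real server process, and `run_with_nls_lang` is a made-up helper name.

```perl
use strict;
use warnings;

# Run a command with its own NLS_LANG and return whatever it prints.
sub run_with_nls_lang {
    my ($nls_lang, @cmd) = @_;
    local $ENV{NLS_LANG} = $nls_lang;   # seen by the child; restored when the sub returns
    open my $fh, '-|', @cmd or die "cannot run @cmd: $!";
    local $/;                           # slurp all of the child's output
    my $out = <$fh>;
    close $fh;
    return $out;
}

for my $nls ('JAPANESE_JAPAN.JA16SJIS', 'AMERICAN_AMERICA.WE8ISO8859P1') {
    my $seen = run_with_nls_lang($nls, $^X, '-e', 'print $ENV{NLS_LANG}');
    print "child saw NLS_LANG=$seen\n";
}
```

In a real setup the command would be whatever starts each server instance, with one instance per character set.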


In reply to Re: UTF-8, Oracle and Perl life by sharkey
in thread UTF-8, Oracle and Perl life by Akira71
