The concept of storing the data as a raw byte stream is what I wanted to accomplish though I'm almost certain that using the Encode functions is not getting me there.

I don't think you should have to use the Encode functions at all in order to put the data into the database. I could be wrong, but if you just pass the variable(s) containing the utf8 string(s) as the arg(s) to the $sth->execute() call (you are using placeholders, aren't you?), it should do the right thing -- Oracle won't know anything about Perl's internal utf8 flag, and doesn't need to know. The string(s) should just go into the table column(s) without further ado.
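For what it's worth, here is a minimal sketch of what I mean by passing the string through a placeholder. The table name "notes", the column "body" and the sub name store_text are all made up for illustration, and $dbh is assumed to be a DBI handle you have already connected; nothing here is specific to your schema.

```perl
use strict;
use warnings;

# Hypothetical helper: "notes" and "body" are made-up names, and $dbh is
# assumed to be an already-connected DBI database handle.
sub store_text {
    my ($dbh, $text) = @_;

    # One "?" placeholder; the value is supplied at execute() time.
    my $sth = $dbh->prepare('INSERT INTO notes (body) VALUES (?)');

    # No Encode call and no manual quoting: the string is handed over as-is.
    return $sth->execute($text);
}
```

Whether the bytes land in the column untouched still depends on how the Oracle client is configured, so treat this as something to try rather than a guarantee.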

(The only way I might be wrong about that is if your Oracle setup happens to behave strangely when given bytes in the range 0x80-0x9f; a lot of the utf8 "continuation" (non-initial) bytes are likely to be in this range, and under some interpretations of "ISO-8859" they are either given some sort of special treatment (e.g. "interpreted" as control characters with strange side effects), or else they are not supposed to exist at all. But I don't think a varchar2 field in Oracle is going to be finicky in this way.)
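If you want to see which bytes of a given string fall into that 0x80-0x9f range, something like the following should do it. The sub names utf8_bytes and c1_range_bytes are just made up for this sketch; encode() comes from the core Encode module.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Return the UTF-8 encoding of a character string as a list of byte values.
sub utf8_bytes {
    my ($chars) = @_;
    return unpack 'C*', encode('UTF-8', $chars);
}

# Return only those bytes that fall in the 0x80-0x9f range.
sub c1_range_bytes {
    my ($chars) = @_;
    return grep { $_ >= 0x80 && $_ <= 0x9f } utf8_bytes($chars);
}

# EURO SIGN (U+20AC) encodes as three bytes.
my @bytes = utf8_bytes("\x{20AC}");
print join(' ', map { sprintf '%02x', $_ } @bytes), "\n";   # prints "e2 82 ac"
```

In that example only the middle byte (0x82) is in the suspect range; the final continuation byte (0xac) is not.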

When you query to get data back from the database, you'll need to do something like $utf8_str = decode( "utf8", $db_string ) to tell Perl that the string is utf8 data.
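Here is a small sketch of that decode step. There is no database involved: $db_string is just a stand-in for whatever your fetch returns, built here with encode() so the snippet runs on its own.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Stand-in for a value fetched from the database: raw utf8 bytes for "cafe"
# with an e-acute (U+00E9), which takes two bytes in utf8.
my $db_string = encode('utf8', "caf\x{e9}");

# Tell Perl the bytes are utf8, yielding a character string.
my $utf8_str = decode('utf8', $db_string);

printf "bytes: %d, characters: %d\n", length($db_string), length($utf8_str);
# prints "bytes: 5, characters: 4"
```

The difference in length() is the quickest way to check whether a string has been decoded yet.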


In reply to Re^3: Character encoding fun... by graff
in thread Character encoding fun... by Anonymous Monk
