I was going to say that a C implementation of the UTF-16 to UTF-8 conversion would be pretty simple and robust -- in fact, you can probably find a C snippet for this at http://www.unicode.org.

But it's true that if you mistakenly feed random (non-UTF-16) data into this sort of conversion, the result might be worse than just "garbage out".

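There are a fair number of "gaps" in the 16-bit space, where Unicode doesn't have anything assigned, as well as some spots that are specifically defined as "noncharacters" (0xFFFE and 0xFFFF, for example). And heaven forbid the input data should contain anything in the UTF-16 "Surrogate" range (0xD800-0xDFFF), which is reserved for building "wider" characters using two consecutive 16-bit values: a high surrogate (0xD800-0xDBFF) followed by a low surrogate (0xDC00-0xDFFF). A valid pair gets rendered as a 4-byte UTF-8 sequence, whereas every other 16-bit value ends up as 1, 2 or 3 bytes in UTF-8. In random data, a surrogate that shows up alone or in the wrong order is simply malformed, and the converter has to decide what to do about it.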
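To make that concrete, here is a minimal sketch of such a conversion in C. It is not the code from unicode.org; the function name `utf16_to_utf8`, the `UTF16_BAD` return value, and the decision to fail on an unpaired surrogate (rather than substitute a replacement character) are all my own assumptions. It also assumes the 16-bit values are already in host byte order, and it does not try to reject unassigned code points or noncharacters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical error value: returned on malformed input or a full buffer. */
#define UTF16_BAD ((size_t)-1)

/* Convert n UTF-16 code units (host byte order assumed) to UTF-8.
 * Writes at most cap bytes to out and returns the number written,
 * or UTF16_BAD if a surrogate is unpaired or out is too small. */
size_t utf16_to_utf8(const uint16_t *in, size_t n, uint8_t *out, size_t cap)
{
    size_t i = 0, o = 0;

    assert(in != NULL && out != NULL);

    while (i < n) {
        uint32_t cp = in[i++];

        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* High surrogate: the next unit must be a low surrogate. */
            if (i >= n || in[i] < 0xDC00 || in[i] > 0xDFFF)
                return UTF16_BAD;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i++] - 0xDC00u);
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            /* Low surrogate with no preceding high surrogate. */
            return UTF16_BAD;
        }

        if (cp < 0x80) {                      /* 1 byte */
            if (cap - o < 1) return UTF16_BAD;
            out[o++] = (uint8_t)cp;
        } else if (cp < 0x800) {              /* 2 bytes */
            if (cap - o < 2) return UTF16_BAD;
            out[o++] = (uint8_t)(0xC0 | (cp >> 6));
            out[o++] = (uint8_t)(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {            /* 3 bytes */
            if (cap - o < 3) return UTF16_BAD;
            out[o++] = (uint8_t)(0xE0 | (cp >> 12));
            out[o++] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (uint8_t)(0x80 | (cp & 0x3F));
        } else {                              /* 4 bytes, from a surrogate pair */
            if (cap - o < 4) return UTF16_BAD;
            out[o++] = (uint8_t)(0xF0 | (cp >> 18));
            out[o++] = (uint8_t)(0x80 | ((cp >> 12) & 0x3F));
            out[o++] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (uint8_t)(0x80 | (cp & 0x3F));
        }
    }
    return o;
}
```

The point of failing early is that random input gets rejected at the first lone surrogate instead of being turned into bytes that merely look like UTF-8. If you would rather keep going, substituting 0xFFFD for the bad unit is the other common choice.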


In reply to Re^2: import UTF-16 strings in XS by graff
in thread import UTF-16 strings in XS by Anonymous Monk
