Hi there,

I've been pondering a strange problem for quite a while now. Here is the situation. I have an MS SQL Server table with an nvarchar column containing strings in all kinds of languages. As an example, I query two rows: one containing an i with a roof on top (a circumflex), and one with the same character followed by something unmistakably outside the 8-bit range, an A with a reversed roof.

When I query these two rows and show them in an HTML page, the roofed i is shown correctly in the first row but not in the second. That is, if the encoding in the browser (both IE and Firefox) is set to UTF-8; if it is set to Western European, then it is the other way around: the roofed i shows, and the Unicode character is printed "wide" as two ASCII characters.

It seems that Perl either reads the roofed i differently from the db (ODBC driver) or writes it out differently, depending on whether there is a two-byte character behind it or not. It's not the HTML, because the same query in a cmd box shows the byte-count difference as well. The first roofed i is represented by C4 83, the second by EE.

It's even weirder when you see that both rows from the database return true on the regex m/ROOFED-I/, and that ord() of the first character is 238 in both cases, the roofed i in its single-byte form, as it were.

In practice this means that strings containing both 8-bit diacritical characters and Unicode characters wider than 8 bits cannot be displayed in HTML. I'm sure I am doing something wrong... who can help?

Grtz=JP
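One possible explanation, offered as a sketch rather than a diagnosis: the symptoms match what Perl does when character strings are printed to a filehandle that has no encoding layer. A string whose characters all fit in 8 bits is written as single bytes, while a string containing any wider character is written as UTF-8 (with a "Wide character in print" warning). The snippet below assumes the ODBC driver already returns correctly decoded character strings, which would fit ord() giving 238 in both rows. The helper bytes_written() and the sample strings are made up for illustration; U+0103 is only a guess at the second character.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write $str to a temp file opened with $mode and
# return the bytes that actually ended up on disk, as uppercase hex.
sub bytes_written {
    my ($str, $mode) = @_;
    my ($tmp_fh, $path) = tempfile(UNLINK => 1);
    close($tmp_fh);

    open(my $out, $mode, $path) or die "Cannot open $path for writing: $!";
    {
        no warnings 'utf8';    # silence "Wide character in print" for the demo
        print {$out} $str;
    }
    close($out) or die "Cannot close $path: $!";

    open(my $in, '<:raw', $path) or die "Cannot open $path for reading: $!";
    my $bytes = do { local $/; <$in> };
    close($in);
    $bytes = '' unless defined $bytes;

    return join ' ', map { sprintf '%02X', ord($_) } split //, $bytes;
}

# Assumed sample data: row 1 holds only the roofed i (U+00EE),
# row 2 holds the roofed i followed by a wider character (U+0103).
my $row1 = "\x{EE}";
my $row2 = "\x{EE}\x{103}";

# No encoding layer: the output bytes depend on what else is in the string.
print bytes_written($row1, '>:raw'), "\n";               # EE
print bytes_written($row2, '>:raw'), "\n";               # C3 AE C4 83

# Explicit UTF-8 layer: both rows are encoded the same way.
print bytes_written($row1, '>:encoding(UTF-8)'), "\n";   # C3 AE
print bytes_written($row2, '>:encoding(UTF-8)'), "\n";   # C3 AE C4 83
```

If that is indeed what is happening, calling binmode(STDOUT, ':encoding(UTF-8)') before printing the page, and declaring UTF-8 as the charset in the HTML, should make both rows display consistently.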

In reply to Strange behaviour ODBC/Unicode in perl by jpvdv
