Hi Monks

Please forgive me (mea culpa): this is not really a Perl question. But I think it is an interesting one that should concern anyone involved with web programming in any form, and there are smart people here who might even know the answer.

The question is this: we all know that Unicode text should generally get transmitted over the wire by the browser as UTF-8 (provided that the form is set up correctly and so on). But what happens when JavaScript grabs that input and does some AJAX tomfoolery with it? Should the JavaScript see the input already converted to UTF-8, or as Unicode? Or is the answer "not defined"?

And even more tricky: when a user enters data into a form by some unknown method (they could be using the MS regional options to specify the keyboard type, for instance Turkish Q, or they might use special software to input Devanagari, hopefully as UTF-8, or they may just cut and paste from god knows where), what should (or does) the OS, almost always Windows, do? Convert the input into UTF-8 because the form wants it? Just pass it along and let the browser sort it out?

This seems fraught with issues because you cannot really tell that a string is a UTF-8 string just by looking at it. (You *might* be able to tell that it is *not* one ...)
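
To make that last point concrete, here is a minimal sketch in Python (not Perl, and not anything a browser actually runs). The helper name could_be_utf8 and the sample strings are my own inventions for illustration. It only shows the asymmetry: a strict decode can prove that bytes are *not* UTF-8, but a successful decode does not prove that they *are*, since the same bytes are also legal in a single-byte encoding.

```python
def could_be_utf8(data: bytes) -> bool:
    """Return False if data is definitely not UTF-8, True if it merely could be.

    Hypothetical helper for illustration only.
    """
    try:
        data.decode("utf-8", errors="strict")
        return True
    except UnicodeDecodeError:
        return False


# The same three Turkish letters, serialized two different ways.
text = "\u0131\u011f\u015f"                 # dotless i, g with breve, s with cedilla
turkish_utf8 = text.encode("utf-8")
turkish_8859_9 = text.encode("iso-8859-9")  # legacy single-byte Turkish encoding

print(could_be_utf8(turkish_utf8))    # passes the check, so it *could* be UTF-8
print(could_be_utf8(turkish_8859_9))  # fails the check, so it is definitely not UTF-8

# The catch: the UTF-8 bytes also decode without error as Latin-1,
# just to different (wrong) characters. Validity alone proves nothing.
misread = turkish_utf8.decode("latin-1")
print(misread == text)
```

So a check like this can reject input, but it can never positively identify it. Whoever receives the bytes still has to be told the encoding out of band.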

If anyone can assist my poor addled brain, which really shouldn't be dealing with this right after 2 days of flu, I will become most eternally unjustifiable about it all. Thanks.


In reply to [not perl] unicode/utf8 in browsers and OS's - where does conversion happen? by danmcb
