in reply to [not perl] unicode/utf8 in browsers and OS's - where does conversion happen?

But what happens when javascript grabs that input

Could you be more precise? A function that reads from the socket should return bytes. A function that returns the text of an XML node should return decoded chars. It all depends on the interface.

It's a question that's easily answered by trying.

when a user enters data into a form

Just a quickie to hold you until someone comes along with more info...

This is a very rough area. IIRC, the agent is supposed to encode the data using the same encoding as the page on which the form resides. However, I remember having a discussion about how that isn't well supported and/or there are issues with the approach.

Of particular interest is that some browsers (possibly all the major ones) will fill a specific field with the name of the encoding they used, if the form includes that field. I can't remember what the field is called.
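To make the "same encoding as the page" point concrete, here is a rough sketch of how one character ends up as different bytes on the wire depending on the encoding used. percentEncodeLatin1 is a hypothetical helper written for this illustration, not a browser API, and it only approximates real form encoding (which, for instance, turns spaces into "+"). As far as I know, the field mentioned above is a hidden input named _charset_, but treat that as something to verify.

```javascript
// Hypothetical helper: percent-encode a string roughly the way a form on an
// ISO-8859-1 page would. Only code points up to U+00FF can be represented.
function percentEncodeLatin1(text) {
  let out = "";
  for (const ch of text) {
    const code = ch.codePointAt(0);
    if (code > 0xff) {
      throw new RangeError("not representable in ISO-8859-1");
    }
    // Leave unreserved ASCII characters alone, escape everything else.
    out += /[A-Za-z0-9_.~-]/.test(ch)
      ? ch
      : "%" + code.toString(16).toUpperCase().padStart(2, "0");
  }
  return out;
}

const value = "caf\u00e9"; // "café"

// Built-in; always encodes as UTF-8.
const asUtf8 = encodeURIComponent(value);
const asLatin1 = percentEncodeLatin1(value);

console.log(asUtf8);   // caf%C3%A9
console.log(asLatin1); // caf%E9
```

The server sees only the escaped bytes, so it has to know (or guess) which encoding the browser used before it can turn them back into characters.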

Re^2: [not perl] unicode/utf8 in browsers and OS's - where does conversion happen?
by danmcb (Monk) on Jan 06, 2008 at 12:28 UTC

    What I mean by the JavaScript thing is:

    var data = myform.mytextarea.value;

    i.e. just grabbing the data out of the textarea directly.

      The JavaScript should see the characters, not the bytes that represent those characters under utf-8, utf-16, iso-8859-1, ascii, windows-1252 or any other character encoding.
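A quick way to check this for yourself is sketched below. It assumes an environment with TextEncoder (current browsers and recent Node.js), and the string literal stands in for whatever myform.mytextarea.value would return. One caveat: JavaScript strings are counted in UTF-16 code units, so characters outside the Basic Multilingual Plane report a length of 2.

```javascript
// Stand-in for the textarea value: "é" followed by the euro sign.
const text = "\u00e9\u20ac";

// The string itself is a sequence of characters, not encoded bytes.
console.log(text.length); // 2

// Bytes only appear once you explicitly encode; TextEncoder produces UTF-8.
const bytes = new TextEncoder().encode(text);
console.log(bytes.length); // 5 (2 bytes for "é", 3 for the euro sign)
```

The encoding of the page or of the HTTP request does not change what text.length reports; it only matters at the point where the characters are turned into bytes.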