(tye)Re3: Unicode source code problem in 5.6.1

Hmm. I guess that might work much of the time. Of course, the code is displayed incorrectly.

When you download the code, you should get the correct byte stream, but tagged as Latin-1. If the code is saved in a UTF-8-aware file system (since you are trying to write code in UTF-8), the bytes would be converted from Latin-1 to UTF-8, which would give you different bytes. Even if you save the code using only one-byte characters, translation could still happen because the browser knows the operating system expects results in something other than Latin-1, like an OEM encoding (such as "code page 437" in Windows).
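To make the byte changes concrete, here is a minimal Python sketch of both cases. The sample source line and all variable names are made up for illustration, and this only models the transcoding step, not what any particular browser or OS actually does on "save as".

```python
# Hypothetical one-line source file containing a single non-ASCII character.
source = 'print "caf\u00e9";\n'
original_bytes = source.encode("utf-8")

# The server tags the bytes as Latin-1, so the browser decodes them that way.
mislabeled_text = original_bytes.decode("latin-1")

# A save that converts "Latin-1 text" to UTF-8 writes different bytes.
transcoded_bytes = mislabeled_text.encode("utf-8")

# A save that just writes the bytes back out round-trips cleanly.
raw_saved_bytes = mislabeled_text.encode("latin-1")

# The one-byte case: the same character has a different byte value
# in Latin-1 than in an OEM code page such as 437.
latin1_byte = "\u00e9".encode("latin-1")
oem_byte = "\u00e9".encode("cp437")

print(original_bytes)    # b'print "caf\xc3\xa9";\n'
print(transcoded_bytes)  # b'print "caf\xc3\x83\xc2\xa9";\n'
print(latin1_byte, oem_byte)  # b'\xe9' b'\x82'
```

The two-byte UTF-8 sequence turns into four bytes after the Latin-1 to UTF-8 conversion, which is the "different bytes" problem. A byte-for-byte save avoids it.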

I'd think that most current "save as" operations would just save bytes and ignore encodings so you'd get the desired byte values. But I wouldn't bet on that.

- tye


Replies are listed 'Best First'.
Re: (tye)Re3: Unicode source code problem in 5.6.1 by John M. Dlugosz (Monsignor) on Nov 18, 2002 at 21:33 UTC
I agree, a cut&paste will probably make it worse, not copy the actual byte stream. But I think telling IE to display the page as UTF-8 overrides the charset setting. That's what it's supposed to do: take the existing byte stream from the server unchanged, but use the interpretation I specify, since I presumably know better than the page author. This presumably changes its mind about the encoding, simply overriding any other way of making the determination. So when I "copy" selected text from the browser window, it knows it's UTF-8 when it copies it to the clipboard and marks it accordingly, or converts to UTF-16 itself and puts that on the clipboard. That means that a Paste should work properly.
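Roughly what I have in mind, as a Python sketch. The sample line and variable names are hypothetical, and I'm assuming the clipboard payload is little-endian UTF-16 with no BOM. This models the idea only; it is not a claim about what IE actually does internally.

```python
# Hypothetical page content: UTF-8 bytes sent by the server unchanged.
served_bytes = 'my $caf\u00e9 = 1;\n'.encode("utf-8")

# Default view: the browser trusts the Latin-1 label.
as_labeled = served_bytes.decode("latin-1")

# User override: the same bytes, reinterpreted as UTF-8.
as_overridden = served_bytes.decode("utf-8")

# Copy: the decoded text goes on the clipboard as UTF-16 (assumed LE, no BOM).
clipboard_utf16 = as_overridden.encode("utf-16-le")

# Paste into an editor that saves as UTF-8: the served bytes come back.
pasted_bytes = clipboard_utf16.decode("utf-16-le").encode("utf-8")

print(pasted_bytes == served_bytes)
```

If the copy happened under the Latin-1 interpretation instead, the clipboard would hold two characters where the original had one, and the paste would not round-trip.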
