The <code> blocks are immune from & expansion by design, so you can't just code HTML entities for funny chars.
So... why can't this site do it for us? We could have a <code utf-8> block and a <code Windows> block, etc. The display formatting logic would always turn chars beyond basic ASCII into named entities or numeric (Unicode) entities, so the text displays properly regardless of the browser's setting (or it could convert them to match the page's stated charset, for characters that exist in that character set).
A variation would be to have some other attribute mark in the opening <code> tag to indicate that some escape character is used in the code block, so we could write such things if we wanted to.
I think a smart default would work, too. If a code block contains characters that are beyond 127 and are legal UTF-8 encodings, it could assume (by default) that it is in fact UTF-8 and convert them to entities. If that's not correct, it would show in the preview window. Getting it wrong is no worse than the current situation with forgetting to escape out square brackets.
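To make the smart default concrete, here is a rough sketch in Python (not this site's actual code, which I haven't seen). The function name `encode_code_block` and the Windows-1252 fallback are my own assumptions for illustration: the bytes are tried as UTF-8 first, and only if that fails are they treated as an 8-bit encoding. Either way, everything beyond ASCII comes out as a numeric entity.

```python
import html


def encode_code_block(raw: bytes, fallback: str = "cp1252") -> str:
    """Hypothetical helper: turn the raw bytes of a code block into ASCII-only HTML.

    Assumption: bytes that form legal UTF-8 are UTF-8; anything else is
    treated as the 8-bit `fallback` encoding (Windows-1252 here).
    """
    try:
        # Pure ASCII is also legal UTF-8, so plain code passes through here.
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not legal UTF-8: assume an 8-bit encoding. Bytes undefined in the
        # fallback become U+FFFD, which would be visible in the preview.
        text = raw.decode(fallback, errors="replace")

    # Escape &, < and > so the code shows literally in the page.
    escaped = html.escape(text, quote=False)

    # Replace every non-ASCII character with a numeric entity such as &#233;.
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")
```

With this, both `"é".encode("utf-8")` and the single Windows-1252 byte `b"\xe9"` end up as the same entity, so the output no longer depends on the browser's encoding setting. A wrong guess (say, 8-bit text that happens to be legal UTF-8) would still show up in the preview, as described above.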
I think changing the sent HTML to UTF-8 is not a solution, since we would continue to have both 8-bit characters and UTF-8 pasted into input fields. The solution is to allow either for input.
Re: Text Encoding on this site's HTML
by grantm (Parson) on Dec 24, 2002 at 04:27 UTC
by theorbtwo (Prior) on Dec 24, 2002 at 06:26 UTC