http://qs1969.pair.com?node_id=1172234


in reply to Re^8: BUG: code blocks don't retain literal formatting -- could they?
in thread BUG: code blocks don't retain literal formatting -- could they?

I should have been more precise.

The PM website uses "windows-1252".[1]

The web browser will interpret the byte stream as windows-1252 characters. Even if UTF-8 encoding were used, the character set would still be windows-1252.[2]
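A minimal Python sketch of what that interpretation looks like, assuming a client that decodes the page strictly as windows-1252 (Python's `cp1252` codec stands in for the browser here, which is an assumption): each byte of a UTF-8 sequence comes out as its own windows-1252 character.

```python
# Bytes a server would send if it emitted UTF-8 for "é" (U+00E9).
utf8_bytes = "é".encode("utf-8")      # two bytes: 0xC3 0xA9

# A client decoding as windows-1252 maps each byte to a separate character.
as_seen = utf8_bytes.decode("cp1252")

print(utf8_bytes.hex(), "->", as_seen)   # c3a9 -> Ã©
```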

Therefore, simply not encoding non-ANSI characters (within code tags) into HTML entities would not work.
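To illustrate why the entities are needed, here is a small Python sketch (again using Python's `cp1252` codec as a stand-in, and a made-up sample string): a character outside windows-1252 has no byte to represent it, so it either fails to encode or has to be replaced by a numeric character reference.

```python
text = "naïve λ"   # "ï" exists in windows-1252, "λ" (U+03BB) does not

# Strict encoding fails on the character that has no windows-1252 byte.
try:
    text.encode("cp1252")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# Falling back to numeric character references keeps the text representable.
with_entities = text.encode("cp1252", errors="xmlcharrefreplace")

print(strict_ok)        # False
print(with_entities)    # b'na\xefve &#955;'
```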

Update: Apparently, the HTML entity encoding takes place in the web browser: Re^3: Strange letters ... (clients)

In theory, this encoding could be reversed, but that would still solve only part of the problem.
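A sketch of such a reversal in Python, using the standard `html.unescape` function; the sample strings are made up for illustration. It also shows one limitation, which may or may not be what is meant above: after the fact, an encoded character cannot be told apart from an entity that was typed literally.

```python
import html

stored = "my $lambda = '&#955;';"     # what comes back after entity encoding
restored = html.unescape(stored)
print(restored)                       # my $lambda = 'λ';

# Hypothetical case: the poster really typed the six characters "&#955;".
typed_literally = "print '&#955;';"
print(html.unescape(typed_literally))  # print 'λ';  (not what was typed)
```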

---

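[1] "windows-1252" is a superset of ASCII (Windows itself calls it the "ANSI" code page) that includes some characters needed for some Western European languages. (It is also a superset of the printable characters of ISO-8859-1, aka "latin-1"; it assigns printable characters to most of the 0x80-0x9F range, which ISO-8859-1 leaves to control codes.)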

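[2] "UTF8" encoding is not specific to Unicode. It is really just a specification for encoding an integer value into a variable-length string of bytes. (The original design covered values up to 31 bits; current UTF-8 is limited to values up to 0x10FFFF.)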
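As a rough illustration of that "integer in, variable-length bytes out" idea, here is a hand-written Python sketch. The function name `encode_utf8` is hypothetical, and the sketch covers only the 1-to-4-byte forms of current UTF-8 (values up to 0x10FFFF); it does not reject surrogates the way a real encoder does.

```python
def encode_utf8(value: int) -> bytes:
    """Encode an integer as UTF-8-style bytes (1 to 4 bytes)."""
    if value < 0x80:                       # 7 bits: one byte, same as ASCII
        return bytes([value])
    if value < 0x800:                      # 11 bits: two bytes
        return bytes([0xC0 | (value >> 6),
                      0x80 | (value & 0x3F)])
    if value < 0x10000:                    # 16 bits: three bytes
        return bytes([0xE0 | (value >> 12),
                      0x80 | ((value >> 6) & 0x3F),
                      0x80 | (value & 0x3F)])
    if value <= 0x10FFFF:                  # 21 bits: four bytes
        return bytes([0xF0 | (value >> 18),
                      0x80 | ((value >> 12) & 0x3F),
                      0x80 | ((value >> 6) & 0x3F),
                      0x80 | (value & 0x3F)])
    raise ValueError("outside the range current UTF-8 allows")

# Compare against Python's built-in encoder for a few sample values.
for cp in (0x41, 0xE9, 0x20AC, 0x1F600):
    assert encode_utf8(cp) == chr(cp).encode("utf-8")
    print(f"U+{cp:04X} -> {encode_utf8(cp).hex()}")
```

The leading byte says how many bytes follow, and every continuation byte carries six bits of the value; nothing in the scheme itself depends on what the integer means.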