Re^2: Unicode in code (pointless)

I agree with those statements as well. I also agree with the person who replied years ago and pointed out that this wouldn't fix anything. Node text is stored in the DB as Latin-1, and CODE tags don't reserve any characters for escaping, so you still couldn't get non-Latin-1 characters to work inside CODE tags, even if you modified the form submission to make it possible to submit UTF-8¹ text containing non-Latin-1 characters inside a CODE block. And even if the database weren't using Latin-1, you'd still have to change the page generation in order to display non-Latin-1 characters in CODE blocks correctly.
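To illustrate the two constraints described above, here is a minimal Python sketch. It is not PerlMonks code: `fits_in_latin1` and `render_code_block` are hypothetical names, and the rendering function is only an assumed stand-in for "escape everything so it displays literally". It shows that a character beyond U+00FF has no Latin-1 byte to be stored as, and that a numeric entity is no way around this inside a CODE block, because the ampersand itself gets escaped.

```python
import html


def fits_in_latin1(text: str) -> bool:
    """Return True if every character can be stored as a single Latin-1 byte."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


def render_code_block(stored: str) -> str:
    """Assumed stand-in for CODE rendering: escape markup so it shows literally."""
    return html.escape(stored, quote=False)


# U+00E9 (e with acute accent) is within Latin-1, so it can be stored.
print(fits_in_latin1("my $x = '\u00e9';"))
# U+03BB (Greek small lambda) is not, so there is nothing to store it as.
print(fits_in_latin1("my $\u03bb = 1;"))
# A numeric entity typed into a CODE block comes back out as literal text.
print(render_code_block("my $&#955; = 1;"))
```

The last call returns the entity with its ampersand escaped, so a browser would show the characters `&#955;` rather than the letter they name.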

¹ Actually, it is already possible to submit UTF-8 to PerlMonks, and some browsers already do so. It is just that some browsers choose (correctly, at this time) not to, and PerlMonks probably still doesn't detect when that happens, in part because it would have to guess: HTTP neglected to require a header stating the encoding used in a POST.
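The guessing mentioned in this footnote could look something like the following Python sketch. This is a common heuristic, not anything PerlMonks is known to do, and `guess_post_encoding` is a hypothetical name: try a strict UTF-8 decode and fall back to Latin-1 when that fails.

```python
def guess_post_encoding(raw: bytes) -> str:
    """Guess whether submitted bytes are UTF-8 or Latin-1.

    Pure ASCII is reported as UTF-8, which is harmless because both
    encodings agree on those bytes.
    """
    try:
        raw.decode("utf-8")  # strict by default: raises on malformed input
    except UnicodeDecodeError:
        return "latin-1"
    return "utf-8"


print(guess_post_encoding(b"caf\xc3\xa9"))  # well-formed UTF-8 sequence
print(guess_post_encoding(b"caf\xe9"))      # lone 0xE9 is not valid UTF-8
```

It really is only a guess. The bytes `\xc3\xa9` are also a legal (if unlikely) pair of Latin-1 characters, so a Latin-1 submission can occasionally be misread as UTF-8.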

- tye

Re^3: Unicode in code (pointless) by TimToady (Parson) on May 02, 2008 at 17:44 UTC
Whether or not we uniphiles continue to harp on this issue, it's going to become more and more of a FAQ as time goes on. Maybe someone should apply for a TPF grant to drag the DB kicking and screaming into the 3rd millennium.
Re^3: Unicode in code (pointless) by John M. Dlugosz (Monsignor) on May 03, 2008 at 14:43 UTC
So the logic for presenting CODE always escapes everything that would otherwise be interpreted as markup in the XML source, and the site stores only 8-bit text, presumably to be interpreted as Latin-1. Maybe the workaround is to add an escape character to that. I wonder if you could find out whether there are some characters that have never appeared in any CODE block. If you interpreted the text as Windows-1252, then you could have the French quotes used in Perl 6. But I prefer to stake out the control codes as escapes.
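A rough Python sketch of the control-code idea, under stated assumptions: the escape character is SUB (0x1A), assumed never to occur in an existing CODE block, and the `ESC hex-codepoint ;` layout is made up here for illustration. The names `to_stored` and `from_stored` are hypothetical. Characters up to U+00FF are stored as their plain Latin-1 byte, so existing nodes would be unaffected.

```python
import re

ESC = "\x1a"  # SUB control code; assumed absent from all existing CODE blocks
_ESCAPED = re.compile(re.escape(ESC) + r"([0-9A-Fa-f]+);")


def to_stored(text: str) -> bytes:
    """Encode text as 8-bit Latin-1, escaping anything beyond U+00FF."""
    if ESC in text:
        raise ValueError("input already contains the escape character")
    pieces = []
    for ch in text:
        if ord(ch) > 0xFF:
            pieces.append(f"{ESC}{ord(ch):X};")  # e.g. ESC + "3BB;"
        else:
            pieces.append(ch)
    return "".join(pieces).encode("latin-1")


def from_stored(raw: bytes) -> str:
    """Decode stored bytes, turning escape sequences back into characters."""
    text = raw.decode("latin-1")
    return _ESCAPED.sub(lambda m: chr(int(m.group(1), 16)), text)


# U+03BB needs an escape; U+00AB and U+00BB are stored as single bytes.
source = "my $\u03bb = \u00aba b\u00bb;"
stored = to_stored(source)
print(stored)
print(from_stored(stored) == source)
```

This only covers storage. Page generation would still have to decode the escapes and emit the result in an encoding (or as entities) the browser can display.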