Re^6: BUG: code blocks don't retain literal formatting -- could they?

Re^7: BUG: code blocks don't retain literal formatting -- could they? by RonW (Parson) on Sep 19, 2016 at 17:42 UTC
But, this would require PM to send proper, UTF8 encoded response content back to browsers. `Why?` So the Content-type: header will have the correct charset= and encoding= attributes.
Re^8: BUG: code blocks don't retain literal formatting -- could they? by perl-diddler (Chaplain) on Sep 20, 2016 at 08:33 UTC
I'm pretty sure that whatever PM sends (proper UTF8 encoded responses or whatever) has no effect on what the Content-type has for charset and encoding attributes. A website can set the Content-type charset and encoding attribs to whatever. That is independent of what they send in the content stream. I.e. doing one doesn't force the other. They can both be done independently -- however, having them in agreement might be less confusing to some browsers.
Likely what is so is that those who are interested in UTF8 set their browsers to assume that encoding for pages that don't declare an encoding, since many HTML4 websites that don't declare an encoding still use UTF8 on their website -- whether by intention or by users typing in UTF8 strings that later get displayed to others. I.e. when we use UTF8, most of us already see it properly as UTF8 chars in our browsers.
What is at issue is that the code blocks convert such things into HTML-entities when it scans our input into the site, but it doesn't convert them on output because they are in code blocks. The bug is that they are converted into HTML-entities in the first place.
Too bad no one is interested in fixing this. I guess they went AWOL... ;-)
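To make the "header and body are independent" point concrete, here is a minimal sketch. It assumes Python's codecs as a stand-in for a browser's decoder, and `decode_as` is a hypothetical helper, not anything PM or a browser exposes. The same bytes are decoded under two different declared charsets:

```python
# The bytes a server actually sends for the text "café", encoded as UTF-8.
body = "café".encode("utf-8")

def decode_as(raw, declared_charset):
    """Decode the response bytes the way a client would if it trusted
    the charset named in the Content-type header (hypothetical helper)."""
    return raw.decode(declared_charset)

# The header agrees with the bytes: the text survives.
agreeing = decode_as(body, "utf-8")

# The header says windows-1252 but the bytes are UTF-8: each byte of the
# two-byte sequence for "é" is shown as its own windows-1252 character.
disagreeing = decode_as(body, "windows-1252")

print(agreeing)
print(disagreeing)
```

Nothing forces the two to match, but when they disagree the reader sees mojibake rather than an error.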
Re^9: BUG: code blocks don't retain literal formatting -- could they? by choroba (Cardinal) on Sep 20, 2016 at 10:57 UTC
> What is at issue is that the code blocks convert such things into HTML-entities when it scans our input into the site, but it doesn't convert them on output because they are in code blocks.
No, that's not what happens. Higher Unicode characters are always converted to entities, not only in code blocks. The problem is that they're displayed correctly outside of code blocks on output but incorrectly inside, because PerlMonks doesn't parse the nodes and doesn't render code blocks differently from other parts. That makes the fix complicated: the site would have to start parsing the content of the nodes.
($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,
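A rough illustration of why the same stored entity looks right in ordinary text and wrong in a code block. This is only a sketch under stated assumptions: that ordinary text is passed through as HTML, that code-block content has its `&` escaped on output, and that `html.unescape` is a fair stand-in for what the browser finally displays. `render_text` and `render_code` are hypothetical names, not PerlMonks functions.

```python
import html

# What the site would be holding after higher characters arrived as entities.
stored = "caf&#233; &#9731;"

def render_text(node):
    """Assumed behaviour for ordinary text: emit the stored markup unchanged."""
    return node

def render_code(node):
    """Assumed behaviour for code blocks: escape markup so it shows literally."""
    return html.escape(node, quote=False)

# html.unescape stands in for the browser turning markup into visible text.
seen_in_text = html.unescape(render_text(stored))
seen_in_code = html.unescape(render_code(stored))

print(seen_in_text)
print(seen_in_code)
```

Under these assumptions the text version shows the characters, while the code version shows the raw `&#233;` and `&#9731;`. Undoing that only inside code blocks would need the site to know where the code blocks are, which is the parsing step mentioned above.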
Re^9: BUG: code blocks don't retain literal formatting -- could they? (browser) by tye (Sage) on Sep 21, 2016 at 01:21 UTC
> The bug is that they are converted into HTML-entities in the first place. Too bad no one is interested in fixing this. I guess they went AWOL... ;-)
Perhaps you should switch to a browser that still has some people doing development work on it? Yes, what is generating HTML entities is your browser, as I explained in Re: Unicode characters in <code> blocks (browser). (Update: And you were already told this in this thread.) The rest of what you wrote above is almost scary in its encouraging unwise and fragile practices. - tye
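For anyone who wants to see the browser-side substitution in isolation, here is a small sketch. It assumes Python's `xmlcharrefreplace` error handler is a reasonable model of what a browser does when a form's encoding cannot represent a character; `submit_as` is a hypothetical helper.

```python
def submit_as(text, form_charset="cp1252"):
    """Encode form text for a page in the given charset, replacing any
    character the charset cannot represent with a numeric character
    reference (a model of browser form submission, not real browser code)."""
    return text.encode(form_charset, errors="xmlcharrefreplace")

# "ï" exists in windows-1252 and is sent as a single byte;
# the snowman does not, so it goes over the wire as an entity.
posted = submit_as("naïve ☃")
print(posted)
```

The server never sees the snowman itself, only the bytes of `&#9731;`, so it cannot tell a typed entity from a substituted one.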
Re^10: BUG: code blocks don't retain literal formatting -- could they? (browser) by perl-diddler (Chaplain) on Sep 21, 2016 at 09:14 UTC
Re^9: BUG: code blocks don't retain literal formatting -- could they? by hippo (Bishop) on Sep 20, 2016 at 09:50 UTC
> Too bad no one is interested in fixing this.
Good news, everyone! Looks like we have a volunteer.
Re^10: BUG: code blocks don't retain literal formatting -- could they? by perl-diddler (Chaplain) on Sep 20, 2016 at 19:52 UTC
Re^11: BUG: code blocks don't retain literal formatting -- could they? by hippo (Bishop) on Sep 20, 2016 at 22:52 UTC
Re^9: BUG: code blocks don't retain literal formatting -- could they? by Your Mother (Archbishop) on Sep 20, 2016 at 20:59 UTC
> I guess they went AWOL
While you may be summarily executed for this in some realms, you don’t actually need anyone’s permission to leave in open source. This unheard-of freedom in human interaction is part of why it attracts so many bright persons to do so much hard work for so little financial gain. Never slight it, not even implicitly.
Re^9: BUG: code blocks don't retain literal formatting -- could they? by RonW (Parson) on Sep 20, 2016 at 18:59 UTC
> Too bad no one is interested in fixing this. I guess they went AWOL
PM's code base dates back to the 1990s. http://everything2.com/title/Everything+Engine
Granted, some of the issues with code tag processing could have been dealt with in the early days of PM; however, the limitations of the windows-1252 character set did not become a problem until years later. Unfortunately, getting the PM website to handle Unicode/UTF8 is much more complicated than adding `use feature 'unicode_strings';` statements to the code.
Re^10: BUG: code blocks don't retain literal formatting -- could they? by perl-diddler (Chaplain) on Sep 20, 2016 at 20:04 UTC
Re^11: BUG: code blocks don't retain literal formatting -- could they? by RonW (Parson) on Sep 20, 2016 at 22:50 UTC
Re^9: BUG: code blocks don't retain literal formatting -- could they? by RonW (Parson) on Sep 20, 2016 at 18:37 UTC
I should have been more precise. The PM website uses "windows-1252".¹ The web browser will interpret the byte stream as windows-1252 characters. And even if UTF8 encoding were used, the character set is still windows-1252.² Therefore, simply not encoding non-ANSI characters (within code tags) into HTML entities would not work.
Update: Apparently, the HTML entity encoding takes place in the web browser: Re^3: Strange letters ... (clients). In theory, this encoding could be reversed, but that would still be only a part of the problem.
---
¹ "windows-1252" is a superset of ANSI that includes some characters needed for some Western European languages. (It is also a superset of ISO-8859-1 (aka "latin-1").)
² "UTF8" encoding is not specific to Unicode. All it really is is a specification for encoding a 32 bit value into a variable length string of bytes.
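A quick check of the windows-1252 points above, assuming Python's `cp1252` and `latin-1` codecs match what browsers use closely enough for illustration. `fits_cp1252` is a hypothetical helper. As far as I can tell, the "superset of latin-1" relationship holds for the printable range; the two differ in bytes 0x80 to 0x9F, where latin-1 has control characters.

```python
# Bytes 0xA0-0xFF decode to the same characters in both encodings.
high = bytes(range(0xA0, 0x100))
same_upper_range = high.decode("cp1252") == high.decode("latin-1")

# Byte 0x80 is the euro sign in windows-1252 but a control character in latin-1.
byte80_cp1252 = b"\x80".decode("cp1252")
byte80_latin1 = b"\x80".decode("latin-1")

def fits_cp1252(text):
    """Return True if every character of text exists in windows-1252."""
    try:
        text.encode("cp1252")
        return True
    except UnicodeEncodeError:
        return False

print(same_upper_range, repr(byte80_cp1252), repr(byte80_latin1))
print(fits_cp1252("café"), fits_cp1252("☃"))
```

Anything for which `fits_cp1252` is false has no byte of its own on a windows-1252 page, which is why leaving it unencoded inside code tags would not be enough.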

