Apparently, if you post Unicode characters in a <code> block, you see the numeric HTML entities instead of the characters. I suspect this is due to some kind of double-encoding bug.
Here are some Unicode characters I selected at random:
حيض,πβΫ
Here are the same characters in a <pre> block:
حيض,πβΫ
And here are the same characters in a <code> block:
&#1581;&#1610;&#1590;,&#960;&#946;&#939;
Note that I didn't type any HTML numeric references myself. I just copied and pasted the characters from the GNOME "charmap" program into the text field.
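For reference, the relationship between the pasted characters and the numeric references that appear in the page source can be checked with a few lines of Python. This is only a sketch: the guess that something between the text field and the server (possibly the browser, when the form's character set cannot represent the characters) substitutes the references is my assumption, not something confirmed here. The variable names are made up for illustration.

```python
# The characters pasted into the post, written as escapes to avoid
# copy/paste and bidi-ordering surprises: three Arabic letters, an
# ASCII comma, then three Greek letters.
text = "\u062d\u064a\u0636,\u03c0\u03b2\u03ab"

# Encode to ASCII, replacing every unrepresentable character with a
# decimal numeric character reference (standard-library error handler).
refs = text.encode("ascii", "xmlcharrefreplace").decode("ascii")

print(refs)
```

The comma survives unchanged because it is plain ASCII; only the non-ASCII characters become references.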
Below is the relevant part of the HTML source of this page:
<p>
Here are some random unicode characters I randomly selected:
<p>
حيض,πβΫ
</p>
Here are the same characters in a <tt class='inlinecode'>&lt;pre&gt;</tt> block:
<p>
<pre>حيض,πβΫ</pre>
<p>
And here are the same characters in a <tt class='inlinecode'>&lt;code&gt;</tt> block:
<p>
<pre class="code"><div class='codeblock'><tt class='codetext'>&amp;#1581;&amp;#1610;&amp;#1590;,&amp;#960;&amp;#946;&amp;#939;
</tt></div></pre>
As you can see, in the <code> block the & characters of the numeric entities are incorrectly escaped. I think this is a pretty serious issue now that Perl can handle native UTF-8 source.
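To make the suspected double encoding concrete, here is a minimal sketch in Python using the standard `html` module. It assumes the text reaches the server already containing numeric references, and that the <code> path escapes every & once more before output. That is my reading of the symptom, not a description of the site's actual code, and the variable names are hypothetical.

```python
import html

# The post body as it is assumed to arrive: numeric character references.
refs = "&#1581;&#1610;&#1590;,&#960;&#946;&#939;"

# Outside <code>: the references are emitted untouched, so the browser
# decodes them back into the original characters.
rendered_once = html.unescape(refs)

# Inside <code> (suspected behaviour): every "&" is escaped again.
double_escaped = html.escape(refs, quote=False)

# The browser undoes only one level of escaping, so the reader is left
# looking at the literal references instead of the characters.
seen_by_reader = html.unescape(double_escaped)

print(double_escaped)
print(seen_by_reader)
```

The second print shows the same string as `refs`, which matches what appears in the <code> block.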
I also notice that these characters are doubly escaped in the textarea I'm typing in now: you can enter Unicode characters in the text field, but on preview the text field shows just a bunch of &#number; entries.
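One possible way around this, sketched in Python, is an escaper that leaves existing numeric character references alone and escapes only bare & characters. The function name `escape_keep_refs` is hypothetical and this is not how the site is known to work; it only illustrates the idea.

```python
import re

# Match "&" only when it does NOT begin a decimal or hexadecimal
# numeric character reference such as "&#960;" or "&#x3C0;".
_BARE_AMP = re.compile(r"&(?!#[0-9]+;|#[xX][0-9a-fA-F]+;)")


def escape_keep_refs(s: str) -> str:
    """Escape &, < and > for HTML output, preserving numeric references."""
    s = _BARE_AMP.sub("&amp;", s)
    return s.replace("<", "&lt;").replace(">", "&gt;")


print(escape_keep_refs("&#960; & <b>"))
```

The trade-off is that someone who really wants to show the literal text of a numeric reference inside a <code> block could no longer do so, which may be why such blocks escape everything.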
Re: Unicode characters in <code> blocks (browser)
by tye (Sage) on Oct 08, 2006 at 05:22 UTC
by ambrus (Abbot) on Oct 08, 2006 at 11:10 UTC
by Joost (Canon) on Oct 08, 2006 at 15:23 UTC
by ambrus (Abbot) on Oct 08, 2006 at 19:19 UTC