in reply to Re: Unicode in <code> sections. (160=&nbsp;)
in thread Unicode in <code> sections.

I was seeing variously a hollow or solid square blob symbol wherever the 160 code was, depending on which of "Arndale Mono", "Bitstream Vera Sans Mono", "Code 2000", "Console437", "Courier", "Courier New", "Fixedsys", "HVRaster", "Lucida Console" and "System" fonts I specified to be used for "preformatted text" in the font configuration for Opera.

Having tried accepting the Author stylesheet (node_id=234493 & node_id=204962) and overriding it with local settings, I eventually discovered an option that is only available via the View menu (not in the extensive configuration dialog, where I spent ages trying every combination of even vaguely related options): View->Encoding->Automatic. At some point, I know not when, I apparently switched this setting off in favour of View->Encoding->Unicode->utf-16. Setting it back to Automatic fixed the problem immediately. It also threw away the contents of the reply dialog, in which I had spent ages typing up all the different things I was trying as I went. Which is a good thing, because at the end of the day everything preceding this is just a cover for

CLOSED: USER ERROR

Sorry to have wasted your valuable time.


Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller

Re: Re: Re: Unicode in <code> sections. (160=&nbsp;)
by mvaline (Friar) on May 09, 2003 at 21:26 UTC
    I guess you solved the problem, but since I authored the original node you referenced, I thought I'd just mention that I was copying and pasting from a utf8 encoded document using Internet Explorer on Mac OS 10.2.6. If in fact it stayed encoded in utf8, it is strange that it wouldn't work in utf16.

      As tye identified above, whatever character was encoded as the leading whitespace, by the time it had gone through cut&paste in your browser, transmission to PM, receipt by a perl script, storage in the PM DB, retrieval via a perl script, and transmission to my browser, it ended up encoded as byte value 160. Exactly where in the chain the transformation occurred I wouldn't even hazard a guess at.

      Suffice it to say, a lone byte 160 (0xA0) is not a valid sequence in utf-8 and is not a complete code unit in utf-16, so with the browser set to ignore the encoding information in the page and treat everything as utf-16, it displayed the 'unknown character' symbol in its place, which is what I was seeing.
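
For anyone who wants to check the byte-160 behaviour described above, here is a minimal sketch. It is in Python rather than Perl, purely for illustration, and the helper name try_decode is my own invention, not something from this thread. It assumes the page delivered a single 8-bit byte with value 160, and it shows how that one byte fares under three different encodings. It says nothing about how Opera actually does its decoding.

```python
# The byte in question: decimal 160, hex A0.
raw = bytes([160])

def try_decode(data: bytes, encoding: str) -> str:
    """Decode data, or return U+FFFD (the 'unknown character') if it is invalid."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return "\ufffd"

# As 8-bit Latin-1, byte 160 is the non-breaking space (&nbsp;).
as_latin1 = try_decode(raw, "latin-1")

# As utf-8, a lone 0xA0 is a continuation byte with no lead byte: invalid.
as_utf8 = try_decode(raw, "utf-8")

# As utf-16, one byte on its own is only half a code unit: invalid.
as_utf16 = try_decode(raw, "utf-16-le")

# For comparison, a properly encoded non-breaking space in utf-8 is two bytes.
nbsp_utf8 = "\u00a0".encode("utf-8")

print(repr(as_latin1), repr(as_utf8), repr(as_utf16), nbsp_utf8)
```

The point of the comparison on the last line is that the character U+00A0 itself is perfectly legal in Unicode; it is only the bare 8-bit byte that a utf-8 or utf-16 decoder cannot make sense of.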

      The fact that some parts of the chain aren't yet set up to handle Unicode means that falling back to an 8-bit extended-ASCII (ANSI?) representation will persist for some time.

