in reply to Fedora and broken pipe ¦
Whenever you expect to see a non-ASCII character but see a question mark instead, this is a very good clue that the display tool (your browser, for example) is expecting utf8 but is getting character data that cannot be parsed as utf8. Typically there are bytes with the eighth bit set that were created in (or intended as) some legacy single-byte encoding such as cp1251 or iso-8859-1. Such a byte standing alone is not valid utf8, because a non-ASCII character in utf8 is always conveyed by at least two bytes.
(Another symptom of that same problem is when you expect to see a string of two or more meaningful non-ASCII characters, and instead you see a smaller number of nonsensical non-ASCII characters. But it is fairly rare that non-utf8 data happens to fall into a pattern that could be parsed as utf8 without errors, so you usually do see one or more "?" in the mix.)
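To make that concrete, here is a minimal sketch in Python (not the language of the original thread, just a convenient one for poking at bytes). The helper name `decode_as_utf8` is made up for illustration, and it assumes the display tool substitutes a replacement character (U+FFFD, which many tools render as "?") for each byte sequence it cannot parse.

```python
def decode_as_utf8(raw: bytes) -> str:
    """Decode bytes the way a utf8-expecting display tool might.

    Hypothetical helper: unparsable bytes become U+FFFD, the
    replacement character that many tools show as "?".
    """
    return raw.decode("utf-8", errors="replace")


# One e-acute (U+00E9) stored in a legacy single-byte encoding: byte 0xE9.
legacy_bytes = "\u00e9".encode("iso-8859-1")

# The same character in utf8 takes two bytes (0xC3 0xA9), never one.
utf8_bytes = "\u00e9".encode("utf-8")

# 0xE9 on its own is not valid utf8, so the reader sees a replacement mark.
shown = decode_as_utf8(legacy_bytes)
print(len(legacy_bytes), len(utf8_bytes), shown == "\ufffd")

# The rarer symptom: two legacy characters (U+00C3, U+00A4) whose bytes
# 0xC3 0xA4 happen to form one valid utf8 sequence, so two intended
# characters collapse into a single unintended one (U+00E4).
collapsed = decode_as_utf8("\u00c3\u00a4".encode("iso-8859-1"))
print(len(collapsed), collapsed == "\u00e4")
```

A strict decode (`raw.decode("utf-8")` without `errors="replace"`) raises `UnicodeDecodeError` on the first case, which is a handy way to test whether a chunk of data is really utf8.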
OTOH, whenever you expect to see a single (meaningful) non-ASCII character but you see a string of two or three nonsensical non-ASCII characters instead, this is a good clue that your display tool is expecting data in a legacy single-byte character set (cp125*, iso-8859-*) and has received utf8 data.
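The opposite mix-up can be sketched the same way. This is only an illustration in Python: `decode_as_legacy` is a hypothetical helper, and cp1252 is picked here as one arbitrary example of the legacy single-byte charsets mentioned above.

```python
def decode_as_legacy(raw: bytes, charset: str = "cp1252") -> str:
    """Decode bytes the way a display tool expecting a single-byte
    legacy charset might: every byte becomes one character.

    Hypothetical helper; errors="replace" covers the few byte values
    that cp1252 leaves undefined.
    """
    return raw.decode(charset, errors="replace")


# One e-acute (U+00E9) is two bytes in utf8 (0xC3 0xA9) ...
two_byte = "\u00e9".encode("utf-8")
# ... so a legacy-charset reader shows two nonsensical characters.
garbled_two = decode_as_legacy(two_byte)
print(len(two_byte), len(garbled_two))

# One euro sign (U+20AC) is three bytes in utf8 (0xE2 0x82 0xAC) ...
three_byte = "\u20ac".encode("utf-8")
# ... so it shows up as three nonsensical characters.
garbled_three = decode_as_legacy(three_byte)
print(len(three_byte), len(garbled_three))
```

If the garbled text can be turned back into the intended character by encoding it as the legacy charset and decoding the result as utf8, that confirms the diagnosis.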
Re^2: Fedora and broken pipe ¦
by cosmicperl (Chaplain) on Aug 16, 2005 at 22:39 UTC |