Whenever you expect to see a non-ASCII character but see a question mark instead, that is a very good clue that the display tool (your browser, for example) is expecting utf8 and is getting character data that cannot be parsed as utf8. Typically there are bytes with the eighth bit set that were created in, or intended as, some legacy encoding (cp1251, iso-8859-1 or whatever). In utf8 a non-ASCII character is always conveyed by at least two bytes, so a lone high-bit byte is invalid.
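To make that concrete, here is a minimal Python sketch of the failure. The sample word "café" and the variable names are my own choices for illustration, and the replacement character U+FFFD stands in for whatever placeholder ("?" or similar) a given display tool happens to draw.

```python
# "é" in the legacy single-byte charset iso-8859-1 is the lone byte 0xE9.
legacy_bytes = "café".encode("iso-8859-1")   # b'caf\xe9'

# A strict utf8 decoder rejects it: 0xE9 announces a multi-byte
# sequence, but the continuation bytes never arrive.
try:
    legacy_bytes.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# A display tool usually substitutes a placeholder instead of failing.
shown = legacy_bytes.decode("utf-8", errors="replace")

# The same character in real utf8 takes two bytes.
utf8_len = len("é".encode("utf-8"))
```

Here `strict_ok` ends up False and `shown` holds "caf" followed by the replacement character, which many tools render as a question mark.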
(Another symptom of that same problem is when you expect to see a string of two or more meaningful non-ASCII characters, and instead you see a smaller number of nonsensical non-ASCII characters. But it is fairly rare for non-utf8 data to fall into a pattern that can be parsed as utf8 without errors, so you usually do see one or more "?" in the mix.)
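One of those rare cases, sketched in Python: the Russian word "Её" was picked by me only because its two cp1251 bytes happen to form a valid utf8 sequence. Most legacy text will not line up this neatly.

```python
# Two meaningful Cyrillic letters stored as cp1251: one byte each.
raw = "Её".encode("cp1251")      # b'\xc5\xb8'

# 0xC5 0xB8 is also a well-formed two-byte utf8 sequence, so a utf8
# decoder accepts it without error and yields ONE unrelated character.
shown = raw.decode("utf-8")      # U+0178, Latin capital Y with diaeresis
```

Two characters went in, one nonsensical character came out, and no "?" appeared to flag the problem.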
OTOH, whenever you expect to see a single (meaningful) non-ASCII character but you see a string of two or three nonsensical non-ASCII characters instead, this is a good clue that your display tool is expecting data in a legacy single-byte character set (cp125*, iso-8859-*) and has received utf8 data.
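The reverse mix-up can be sketched the same way. The sample characters and the choice of cp1252 as the "expected" legacy charset are assumptions for illustration; the final step only works when the wrong decoding step lost no bytes.

```python
# One character, two or three bytes in utf8.
e_acute_utf8 = "é".encode("utf-8")   # b'\xc3\xa9'
euro_utf8 = "€".encode("utf-8")      # b'\xe2\x82\xac'

# A tool expecting a single-byte charset shows one character per byte.
e_acute_shown = e_acute_utf8.decode("cp1252")   # two nonsense characters
euro_shown = euro_utf8.decode("cp1252")         # three nonsense characters

# If nothing was lost, reversing the wrong step recovers the original.
repaired = e_acute_shown.encode("cp1252").decode("utf-8")
```

So "é" shows up as "Ã©" and "€" as "â‚¬", which is the two-or-three-character garbage described above.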
In reply to Re: Fedora and broken pipe ¦
by graff
in thread Fedora and broken pipe ¦
by cosmicperl