in reply to Fedora and broken pipe ¦

You need to learn how character-set (encoding) selection is controlled in three places: your browser, your cgi script(s), and whatever text editor you're using. In particular, look at how to declare the character set you're actually using when you generate HTML content from a perl cgi script, so that a browser receiving the content knows (because you've told it) how to display it correctly.
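As a rough illustration of "telling the browser", here is a minimal sketch of a cgi script that names its charset in the Content-Type header and encodes the body to match. The helper name page_for and the sample text are made up for this example; only the core Encode module is assumed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: returns the HTTP header plus an HTML body,
# with the body encoded in the same charset the header declares.
sub page_for {
    my ($charset, $text) = @_;
    my $header = "Content-Type: text/html; charset=$charset\r\n\r\n";
    my $body   = encode($charset, "<html><body><p>$text</p></body></html>\n");
    return $header . $body;
}

binmode STDOUT;                          # emit the bytes exactly as encoded
print page_for('UTF-8', "caf\x{e9}");    # \x{e9} is e-acute
```

The point is only that the declared charset and the bytes actually sent agree; passing 'ISO-8859-1' instead would send a single byte for the accented letter.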

Whenever you expect to see a non-ASCII character but see a question mark instead, this is a very good clue that the display tool (your browser, for example) is expecting utf8, and is getting some character data that is not parsable as utf8 -- e.g. there are bytes with the eighth bit set that were created/intended as some legacy encoding (e.g. cp1251 or iso-8859-1 or whatever); non-ASCII characters in utf8 must always be conveyed by at least two bytes.

(Another symptom of that same problem is when you expect to see a string of two or more meaningful non-ASCII characters, and instead you see a smaller number of nonsensical non-ASCII characters. But it is fairly rare that non-utf8 data happens to fall into a pattern that could be parsed as utf8 without errors, so you usually do see one or more "?" in the mix.)
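To see the first case for yourself, here is a small sketch (the sample word is arbitrary) that takes iso-8859-1 bytes and tries to read them as utf8 with the core Encode module. A lenient decode substitutes a replacement character, which is roughly what a browser is doing when it shows "?"; a strict decode refuses the data outright.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# "cafes" with e-acute, stored as iso-8859-1: the accented letter is the lone byte 0xE9
my $latin1_bytes = encode('iso-8859-1', "caf\x{e9}s");

# Lenient decode: the unparsable byte is replaced with U+FFFD
my $lenient = decode('UTF-8', $latin1_bytes);

# Strict decode: dies on malformed input (LEAVE_SRC keeps $latin1_bytes intact)
my $strict_ok = eval {
    decode('UTF-8', $latin1_bytes, Encode::FB_CROAK | Encode::LEAVE_SRC);
    1;
};

print "lenient decode contains U+FFFD\n" if $lenient =~ /\x{FFFD}/;
print "strict decode failed\n"           unless $strict_ok;
```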

OTOH, whenever you expect to see a single (meaningful) non-ASCII character but you see a string of two or three nonsensical non-ASCII characters instead, this is a good clue that your display tool is expecting data in a legacy single-byte character set (cp125*, iso-8859-*) and has received utf8 data.
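The reverse case can be reproduced the same way. This sketch (again using only the core Encode module, with an arbitrary sample character) encodes one character as utf8 and then decodes those bytes as iso-8859-1, the way a browser expecting a single-byte charset would.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# One character (e-acute) becomes two bytes in utf8: 0xC3 0xA9
my $utf8_bytes = encode('UTF-8', "\x{e9}");

# Read as iso-8859-1, each byte is taken as its own character:
# U+00C3 (A-tilde) followed by U+00A9 (copyright sign)
my $misread = decode('iso-8859-1', $utf8_bytes);

printf "bytes sent: %d, characters displayed: %d\n",
    length($utf8_bytes), length($misread);
```

One intended character turning into two nonsensical ones is exactly the pattern described above.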

Re^2: Fedora and broken pipe ¦
by cosmicperl (Chaplain) on Aug 16, 2005 at 22:39 UTC
    Hi Graff,
       You're right. All browsers were defaulting to UTF-8 even though the page had a meta tag for ISO-8859-1. When I compared the apache httpd.conf files for RH9 and Fedora I saw that Fedora defaulted to UTF-8 while RH9 went for the standard ISO-8859-1. I updated the apache config and restarted and all is now fine. Thanks for pointing me in the right direction.
    AddDefaultCharset UTF-8
    changed to
    AddDefaultCharset ISO-8859-1

    Lyle