You need to learn how character-set (encoding) selection is controlled in your browser, in your cgi script(s), and in whatever text editor you're using. In particular, look at how to declare the character set you're actually using when you generate HTML content from a perl cgi script, so that when a browser receives the content, it will know (because you've told it) how to display it correctly.
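For what it's worth, here is a minimal sketch of a cgi script that declares its character set. It assumes you want utf8 output and that you build the response by hand rather than through a module; the helper name response_bytes is made up for this example, and the text is interpolated without HTML escaping to keep it short.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: builds the whole CGI response (header plus body)
# as a character string, then encodes it to UTF-8 bytes in one place.
sub response_bytes {
    my ($text) = @_;
    # The charset parameter is what tells the browser how to read the bytes.
    my $header = "Content-Type: text/html; charset=utf-8\r\n\r\n";
    my $body   = qq{<html><head><meta charset="utf-8"><title>demo</title></head>}
               . qq{<body><p>$text</p></body></html>\n};
    return encode('UTF-8', $header . $body);
}

binmode STDOUT;                       # raw bytes; the encoding is already done
print response_bytes("caf\x{e9}");    # e-acute goes out as 0xC3 0xA9
```

The point is only that the declared charset and the bytes you actually print have to agree. If your data is really cp1251, say so in the header instead of claiming utf-8.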

Whenever you expect to see a non-ASCII character but see a question mark instead, that is a very good clue that the display tool (your browser, for example) is expecting utf8 and is getting character data that cannot be parsed as utf8. Typically there are bytes with the eighth bit set that were created or intended as some legacy encoding (cp1251, iso-8859-1, or whatever). In utf8, every non-ASCII character is conveyed by at least two bytes, so a lone high-bit byte is never valid.
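You can check this from the command line with the Encode module. The following is just a sketch with a sample string of my own choosing: it encodes the same text as iso-8859-1 and as utf8, then tries to read the legacy bytes as utf8, once strictly and once leniently.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

my $text   = "caf\x{e9} au lait";            # sample text, chosen for illustration
my $legacy = encode('iso-8859-1', $text);    # e-acute is the single byte 0xE9
my $utf8   = encode('UTF-8', $text);         # e-acute is the two bytes 0xC3 0xA9

# Strict decoding dies on malformed input. Work on a copy, because decode()
# with a CHECK argument may modify the buffer it is given.
my $copy      = $legacy;
my $strict_ok = eval { decode('UTF-8', $copy, Encode::FB_CROAK); 1 } ? 1 : 0;

# Lenient decoding substitutes U+FFFD (the replacement character) for bad
# bytes, which is roughly what a browser does before showing you a "?".
my $lenient  = decode('UTF-8', $legacy);
my $replaced = () = $lenient =~ /\x{FFFD}/g;

printf "legacy bytes: %d, utf8 bytes: %d\n", length($legacy), length($utf8);
printf "legacy bytes parse as utf8: %s\n", $strict_ok ? "yes" : "no";
printf "replacement characters after lenient decode: %d\n", $replaced;
```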

(Another symptom of that same problem is when you expect to see a string of two or more meaningful non-ASCII characters, and instead you see a smaller number of nonsensical non-ASCII characters. But it is fairly rare for non-utf8 data to fall into a pattern that can be parsed as utf8 without errors, so you usually do see one or more "?" in the mix.)

OTOH, whenever you expect to see a single (meaningful) non-ASCII character but you see a string of two or three nonsensical non-ASCII characters instead, this is a good clue that your display tool is expecting data in a legacy single-byte character set (cp125*, iso-8859-*) and has received utf8 data.
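This opposite case is easy to reproduce too. As a rough illustration (the two sample characters are my own picks, and cp1252 stands in for whichever single-byte set your display tool assumes), encode a character as utf8 and then decode those bytes as cp1252:

```perl
use strict;
use warnings;
use Encode qw(encode decode);

my $e_acute = "\x{e9}";      # e-acute: two bytes in utf8
my $euro    = "\x{20ac}";    # euro sign: three bytes in utf8

# Misread the utf8 bytes as cp1252, one character per byte.
my $seen_e    = decode('cp1252', encode('UTF-8', $e_acute));
my $seen_euro = decode('cp1252', encode('UTF-8', $euro));

binmode STDOUT, ':encoding(UTF-8)';
printf "e-acute shows up as %d characters: %s\n", length($seen_e),    $seen_e;
printf "euro shows up as %d characters: %s\n",    length($seen_euro), $seen_euro;

# The damage is reversible as long as nothing was lost along the way:
my $repaired = decode('UTF-8', encode('cp1252', $seen_euro));
print "round trip restores the euro sign\n" if $repaired eq $euro;
```

So one intended character turns into two or three junk characters, one per utf8 byte, which is exactly the pattern described above.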

In reply to Re: Fedora and broken pipe Ś by graff
in thread Fedora and broken pipe Ś by cosmicperl
