
utf8::is_utf8($string) tells you whether a string is stored internally as UTF-8 or as single-byte characters (no require/use needed). utf8::upgrade($string) converts a string stored as single-byte characters to UTF-8 storage, without changing the characters it contains. But that's not usually what you want; you want a layer on the filehandle that converts whichever form is being output to UTF-8 (or whatever other encoding you choose). You can set this layer with open, or with binmode after the file is opened.
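To make that concrete, here is a minimal sketch rather than a fix for your actual program. The sample string "caf\xe9" and all variable names are made up for illustration, and an in-memory filehandle stands in for a real file so that the result can be inspected:

```perl
use strict;
use warnings;

# Hypothetical sample data: "cafe" with e-acute (U+00E9), held in
# Perl's single-byte internal form.
my $string = "caf\xe9";
my $before = utf8::is_utf8($string) ? 1 : 0;   # 0: single-byte storage

# Switch a copy to UTF-8 internal storage. Only the storage changes.
my $copy = $string;
utf8::upgrade($copy);
my $after = utf8::is_utf8($copy) ? 1 : 0;      # 1: UTF-8 storage
my $same  = ($copy eq $string) ? 1 : 0;        # 1: same four characters

# Option 1: put the encoding layer on the handle when opening it.
my $bytes = '';
open(my $fh, '>:encoding(UTF-8)', \$bytes) or die "open failed: $!";
print {$fh} $string;   # single-byte form
print {$fh} $copy;     # upgraded form
close($fh) or die "close failed: $!";

# Option 2: open first, then add the layer with binmode.
my $bytes2 = '';
open(my $fh2, '>', \$bytes2) or die "open failed: $!";
binmode($fh2, ':encoding(UTF-8)') or die "binmode failed: $!";
print {$fh2} $string;
close($fh2) or die "close failed: $!";

# Each print wrote "caf" followed by the two bytes 0xC3 0xA9.
printf "before=%d after=%d same=%d bytes=%d bytes2=%d\n",
    $before, $after, $same, length($bytes), length($bytes2);
```

Both internal forms come out as the same bytes once the layer is in place, which is why the flag itself is rarely the thing to adjust.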

But some actual sample code/data would be very helpful; when you say "writing two characters for every Unicode byte" it makes me think you have some misconceptions that we could help clear up.

In reply to Re: How to set the UTF8 flag? by ysth
in thread How to set the UTF8 flag? by dissident
