it actually is important that the "internal form" is (very much like) utf8 unicode.

I'm not sure I can follow your arguments. Which of those desirable properties would be impossible if Perl used a different internal Unicode string representation? Other languages, such as Java and Python, have chosen different internal representations, yet they are perfectly capable of matching regexes and of parsing string literals (analogous to "\x{abcd}" in Perl) into their internal form.

It's just a matter of how things are implemented. Of course, different implementations have different trade-offs in performance (speed and memory) and ease of implementation, but I don't see why UTF-8 would be required as the internal form to realize the properties you mentioned.
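To make the point concrete, here is a minimal sketch using only the two storage formats Perl already has. It does not show a hypothetical UTF-16 or UTF-32 internal form; it only illustrates that string equality, length, and a literal regex match depend on the characters and not on how they are stored. The variable names are made up for the example.

```perl
use strict;
use warnings;

# Two copies of the same four characters: c, a, f, U+00E9.
my $narrow = "caf\x{e9}";
my $wide   = $narrow;

utf8::downgrade($narrow);   # force one-byte-per-character storage
utf8::upgrade($wide);       # force UTF-8 storage of the same characters

# The internal representations now differ ...
my $storage_differs = (!utf8::is_utf8($narrow) && utf8::is_utf8($wide)) ? 1 : 0;

# ... but the strings behave identically at the Perl level.
my $same_string = ($narrow eq $wide)                 ? 1 : 0;
my $same_length = (length($narrow) == length($wide)) ? 1 : 0;
my $both_match  = ($narrow =~ /\x{e9}\z/ && $wide =~ /\x{e9}\z/) ? 1 : 0;

print "storage differs: $storage_differs\n";
print "same string:     $same_string\n";
print "same length:     $same_length\n";
print "both match:      $both_match\n";
```

All four lines should print 1. The regex deliberately matches a literal code point rather than a class like \w, whose behaviour on characters in the 128..255 range can depend on the storage format in older perls or without the unicode_strings feature.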

In reply to Re^2: text encodings and perl by Anonymous Monk
in thread text encodings and perl by andal
