
Code points is an abstraction, it's an internal Perl thing.

What are you talking about? It has nothing to do with Perl. "e" is formed from the code point U+0065, "é" is formed from code point U+00E9 or from code points U+0065 + U+0301, etc. This is defined by The Unicode Consortium, not by Perl.
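To make the two spellings of "é" concrete, here is a minimal sketch using the core Unicode::Normalize module. The variable names and the code_points helper are only for illustration; the strings are written with escapes so the encoding of the source file doesn't matter.

```perl
use strict;
use warnings;
use Unicode::Normalize qw( NFC NFD );

# Two spellings of the same text, "é".
my $precomposed = "\x{E9}";      # U+00E9
my $decomposed  = "e\x{301}";    # U+0065 U+0301

# Hypothetical helper: lists the code points of a string.
sub code_points { join ' ', map { sprintf 'U+%04X', ord } split //, $_[0] }

print code_points($precomposed),      "\n";   # U+00E9
print code_points($decomposed),       "\n";   # U+0065 U+0301
print code_points(NFC($decomposed)),  "\n";   # U+00E9
print code_points(NFD($precomposed)), "\n";   # U+0065 U+0301
```

None of this is specific to Perl; the module just applies the normalization forms that Unicode defines.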

It must produce a bunch of bytes.

No, the input must be a string whose characters all have values in 0..255, which it is. print has no problem storing those as bytes. iso-latin-1 doesn't factor into it.

In which of the following does print use iso-latin-1?

use utf8;
use Socket qw( inet_aton );
use Encode qw( encode_utf8 );
my $s1 = inet_aton('195.169.195.171');  print($s1);
my $s2 = encode_utf8("éë");             print($s2);
my $s3 = "Ã©Ã«";                        print($s3);
my $s4 = "\xC3\xA9\xC3\xAB";            print($s4);

The only two possible answers are "all of them" or "none of them", since print can't tell the difference between those strings.

If you claim that iso-latin-1 is used, then you claim that use utf8; produces iso-latin-1. It doesn't. It produces Unicode code points.
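The claim that print can't tell these strings apart is easy to check. The sketch below rebuilds the same four strings (with escapes instead of literals, so it doesn't depend on how the file is saved) and compares them. The hex_chars helper is hypothetical.

```perl
use strict;
use warnings;
use Socket qw( inet_aton );
use Encode qw( encode_utf8 );

my $s1 = inet_aton('195.169.195.171');                     # packed IPv4 address
my $s2 = encode_utf8("\x{E9}\x{EB}");                      # UTF-8 encoding of U+00E9 U+00EB
my $s3 = join '', map { chr } 0xC3, 0xA9, 0xC3, 0xAB;      # code points U+00C3 U+00A9 U+00C3 U+00AB
my $s4 = "\xC3\xA9\xC3\xAB";                               # four bytes

# Hypothetical helper: shows each character of a string as two hex digits.
sub hex_chars { join ' ', map { sprintf '%02X', ord } split //, $_[0] }

print hex_chars($_), "\n" for $s1, $s2, $s3, $s4;          # C3 A9 C3 AB, four times

my $all_equal = $s1 eq $s2 && $s2 eq $s3 && $s3 eq $s4;
print $all_equal ? "identical\n" : "different\n";          # identical
```

Each one is a string of four characters with values C3 A9 C3 AB. Nothing in the string records whether it started life as an address, as encoded text, or as decoded text.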

That prints garbage instead of 'ç'.

Because the terminal expects the UTF-8 encoding of the code point (bytes C3 A7 for "ç"), but it got the code point itself as a single byte (E7).
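Here is a rough way to see the difference without involving a terminal, assuming "ç" is U+00E7 and the terminal wants UTF-8. It prints the same one-character string to two in-memory handles, one with no encoding layer and one with :encoding(UTF-8), and dumps the bytes each handle received. The handle and helper names are made up for the example.

```perl
use strict;
use warnings;

my $c = "\x{E7}";    # "ç", U+00E7

# No encoding layer: print stores the character value as one byte.
open my $raw, '>:raw', \my $raw_out or die "open: $!";
print {$raw} $c;
close $raw;

# With an encoding layer: print's output is encoded to UTF-8 on the way out.
open my $enc, '>:encoding(UTF-8)', \my $enc_out or die "open: $!";
print {$enc} $c;
close $enc;

# Hypothetical helper: shows each byte of a buffer as two hex digits.
sub hex_bytes { join ' ', map { sprintf '%02X', ord } split //, $_[0] }

print hex_bytes($raw_out), "\n";    # E7
print hex_bytes($enc_out), "\n";    # C3 A7
```

A UTF-8 terminal can't make sense of a lone E7 byte, which is why it shows garbage. Encoding the output, for example with binmode(STDOUT, ':encoding(UTF-8)') or encode_utf8, gives it the C3 A7 it expects.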

In reply to Re^4: Default encoding rules leave me puzzled... by ikegami
in thread Default encoding rules leave me puzzled... by kzwix
