Re^2: The Queensrÿche Situation

Re^3: The Queensrÿche Situation by LanX (Saint) on Oct 19, 2014 at 18:21 UTC
As far as I can see from the German Wikipedia page, 255 (FF) is the Latin-1 code. And Google is your friend: http://www.utf8-chartable.de/unicode-utf8-table.pl?utf8=0x&unicodeinhtml=hex shows that C3 BF is the UTF-8 encoding.

Cheers Rolf
(addicted to the Perl Programming Language and ☆☆☆☆ :)
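To check those two values without a lookup table, here is a minimal sketch using the core Encode module. The helper name `hex_bytes` is made up for this example, and the character is written as an escape so the result does not depend on how the script file is saved.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: space-separated hex dump of a byte string.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf('%02X', ord($_)) } split //, $bytes;
}

# U+00FF LATIN SMALL LETTER Y WITH DIAERESIS, as a character (not bytes).
my $char = "\x{FF}";

# The same character serialized in two different encodings.
my $latin1 = hex_bytes(encode('ISO-8859-1', $char));
my $utf8   = hex_bytes(encode('UTF-8',      $char));

print "Latin-1: $latin1\n";    # FF
print "UTF-8:   $utf8\n";      # C3 BF
```

One code point, two byte sequences: which one you see depends only on the encoding used when the character is written out.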
Re^4: The Queensrÿche Situation by Rodster001 (Pilgrim) on Oct 19, 2014 at 18:42 UTC
Right. So #1 is utf-8. Then #2 is utf-16?

So then why does this:

`use utf8; my $string = "Queensrÿche"; no utf8;`

Produce this:

`81 Q Q 117 u u 101 e e 101 e e 110 n n 115 s s 114 r r 255 {ff} 99 c c 104 h h 101 e e - this is utf8`

When this:

`#use utf8; my $string = "Queensrÿche"; #no utf8;`

Produces this:

`81 Q Q 117 u u 101 e e 101 e e 110 n n 115 s s 114 r r 195 191 99 c c 104 h h 101 e e - this is NOT utf8`

If the two bytes are "there", why is "use utf8" yielding a dec 255 for the "ÿ", which is not valid utf8?

"The first 128 characters (US-ASCII) need one byte. The next 1,920 characters need two bytes to encode." - http://en.wikipedia.org/wiki/UTF-8
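A rough way to see what is going on, assuming the script file is saved as UTF-8: `use utf8` tells perl to decode string literals into characters, so `ord` reports the code point (255) and not the bytes on disk. The sketch below imitates both cases by starting from the two raw bytes and calling `decode` explicitly. It is an approximation of the pragma's effect, not a description of perl's internals.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# The two bytes a UTF-8 editor stores on disk for y-with-diaeresis.
my $raw = "\xC3\xBF";

# Without "use utf8" the literal stays as these raw bytes.
my @as_bytes = map { ord($_) } split //, $raw;

# With "use utf8" the literal is decoded into characters, roughly as
# decode() does here, and ord() then reports the code point.
my $chars    = decode('UTF-8', $raw);
my @as_chars = map { ord($_) } split //, $chars;

# Encoding for output turns the single character back into two bytes.
my @round_trip = map { ord($_) } split //, encode('UTF-8', $chars);

print "bytes:      @as_bytes\n";      # 195 191
print "characters: @as_chars\n";      # 255
print "re-encoded: @round_trip\n";    # 195 191
```

So 255 is not a UTF-8 byte here at all. It is the character number, and the 1,920-characters-need-two-bytes rule only applies once the string is encoded again for output.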
Re^5: The Queensrÿche Situation by Jim (Curate) on Oct 19, 2014 at 20:02 UTC
"Right. So #1 is utf-8. Then #2 is utf-16?"

No, #2 is ISO-8859-1, which is also known as Latin 1. As it happens, it's also Windows-1252, which today is really a quasi-superset of ISO-8859-1. Neither ISO-8859-1 nor Windows-1252 is a Unicode encoding at all, so #2 is not in any Unicode character encoding scheme such as UTF-16. The character encodings ISO-8859-1 (Latin 1) and Windows-1252 are often referred to as "legacy encodings," especially vis-à-vis Unicode.
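A small sketch of the distinction, assuming #2 refers to the single byte FF. It compares the legacy encodings with big-endian UTF-16 for U+00FF, and then shows one place where Windows-1252 and ISO-8859-1 disagree (byte 0x80). The helper `hex_bytes` is a made-up name for this example.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: space-separated hex dump of a byte string.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf('%02X', ord($_)) } split //, $bytes;
}

my $y = "\x{FF}";    # U+00FF, y with diaeresis

# Both legacy encodings use one byte; UTF-16 uses two.
my $as_latin1 = hex_bytes(encode('ISO-8859-1', $y));
my $as_cp1252 = hex_bytes(encode('cp1252',     $y));
my $as_utf16  = hex_bytes(encode('UTF-16BE',   $y));

# Where the two legacy encodings part ways: byte 0x80.
my $cp1252_80 = sprintf 'U+%04X', ord(decode('cp1252',     "\x80"));
my $latin1_80 = sprintf 'U+%04X', ord(decode('ISO-8859-1', "\x80"));

print "ISO-8859-1: $as_latin1\n";             # FF
print "cp1252:     $as_cp1252\n";             # FF
print "UTF-16BE:   $as_utf16\n";              # 00 FF
print "0x80 in cp1252:     $cp1252_80\n";     # U+20AC (euro sign)
print "0x80 in ISO-8859-1: $latin1_80\n";     # U+0080 (C1 control)
```

A lone FF byte is therefore consistent with either legacy encoding but not with UTF-16, which would need the extra 00 byte.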