Thank you for your thorough explanation, tye. You answered my question.
The W3C is doing the right thing. (See 8.2.2.2 Character encodings in the HTML5 working draft specification.) Its willful violation of anachronistic standards for compelling, practical reasons is, IMHO, a practice that is overdue in Perl 5. By now, Perl 5 should also be defaulting to Windows-1252 instead of to ISO 8859-1 (Latin 1). Its failure to do this is one of the little things that make Perl 5 seem old and crufty, especially to Windows programmers. By dogmatically adhering to some misguided commitment to compatibility and portability, Perl 5 violates the principle of least astonishment.
By the way, I had done something like this…
C:\>chcp
Active code page: 1252

C:\>type match_test_3.pl
#!perl

use strict;
use warnings;

use open qw( :encoding(Windows-1252) :std );

my $pattern = qr/\A\w+\z/;

for my $word (@ARGV) {
    my $result = $word =~ $pattern ? "matches" : "doesn't match";
    printf qq/The word "%s" %s the pattern %s\n/, $word, $result, $pattern;
}

C:\>perl match_test_3.pl Tšekissä Žena Œdipus Rex
"\x{009a}" does not map to cp1252 at match_test_3.pl line 12.
The word "T\x{009a}ekissä" doesn't match the pattern (?^:\A\w+\z)
"\x{008e}" does not map to cp1252 at match_test_3.pl line 12.
The word "\x{008e}ena" doesn't match the pattern (?^:\A\w+\z)
"\x{008c}" does not map to cp1252 at match_test_3.pl line 12.
The word "\x{008c}dipus" doesn't match the pattern (?^:\A\w+\z)
The word "Rex" matches the pattern (?^:\A\w+\z)

C:\>
…before I posted my inquiry here, to prove to myself that the problem wasn't limited to Windows-1252 characters in the range from 80 thru 9F appearing literally within the Perl source file.
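Here is a minimal sketch of what I believe is going on, assuming the command-line arguments arrive as raw Windows-1252 bytes and that the open pragma only affects the I/O layers, not @ARGV. The helper word_matches and the hard-coded byte strings are made up for illustration; they are not part of the script above. It decodes the same bytes once as cp1252 and once as ISO 8859-1 and tests each result against the same pattern.

```perl
#!perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: decode raw bytes with the named encoding, then
# report whether the decoded text consists solely of word characters.
sub word_matches {
    my ($bytes, $encoding) = @_;
    my $text = decode($encoding, $bytes);
    return $text =~ /\A\w+\z/ ? 1 : 0;
}

# The words from the test run, written as the Windows-1252 bytes a
# code page 1252 console would presumably hand to perl (an assumption).
my @raw = (
    "T\x9Aekiss\xE4",   # Tšekissä
    "\x8Eena",          # Žena
    "\x8Cdipus",        # Œdipus
    "Rex",
);

for my $bytes (@raw) {
    # Print the bytes in hex so the output stays pure ASCII.
    my $hex = join ' ', map { sprintf '%02X', ord } split //, $bytes;
    printf "%-24s cp1252=%d  iso-8859-1=%d\n",
        $hex,
        word_matches($bytes, 'cp1252'),
        word_matches($bytes, 'iso-8859-1');
}
```

Decoded as cp1252, bytes 9A, 8E and 8C become the letters š, Ž and Œ. Decoded as ISO 8859-1, they become the C1 control characters U+009A, U+008E and U+008C, which are not word characters and have no mapping back to cp1252. If that is indeed the cause, running the elements of @ARGV through decode('cp1252', ...) before matching ought to sidestep both the failed matches and the "does not map" warnings.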
There's a Feedback button at the bottom of the page http://www.fileformat.info/info/unicode/char/009a/index.htm. ☺
Thanks again.
Jim