in reply to Perl's encoding versus UTF8 octets
The question is how \xc3\xa4 is actually represented in your file.
If you write qq|\xc3\xa4| in your source code, then Perl interprets the eight characters as two bytes with the hexadecimal values of c3 and a4, respectively. But if you read \xc3\xa4 from a file, this interpretation doesn't take place: they remain eight individual ASCII characters. What you can do, of course, is do the interpretation yourself:
    use Encode;
    my $ucode = q/\xc3\xa4/;   # note the use of 'q', not 'qq'
    my $newcode = decode('utf8', $ucode =~ s/\\x([a-fA-F0-9]{2})/chr hex($1)/egr);
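If you want to convince yourself of the difference, here is a small sketch that compares the two quoting styles and then checks the decoded result. The variable names are mine, and it assumes every escape in the input is exactly `\x` followed by two hex digits; anything else is left untouched.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Single-quoted: the backslash escapes are NOT interpreted.
my $literal = q/\xc3\xa4/;
# Double-quoted: Perl turns the escapes into two bytes.
my $bytes   = qq/\xc3\xa4/;

printf "literal: %d characters\n", length $literal;
printf "bytes:   %d characters\n", length $bytes;

# Do the interpretation by hand: each \xHH becomes one byte.
(my $octets = $literal) =~ s/\\x([0-9a-fA-F]{2})/chr hex $1/eg;

# Decode the UTF-8 octets into a Perl character string.
my $char = decode('UTF-8', $octets);

printf "decoded: %d character, code point U+%04X\n", length $char, ord $char;
```

This variant copies the string and uses plain s///eg instead of the /r flag, so it should also run on Perls older than 5.14.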
Re^2: Perl's encoding versus UTF8 octets
by Polyglot (Chaplain) on Jan 13, 2021 at 07:06 UTC