:utf8 doesn't fix bad data. It warns, but the malformed byte still ends up in the string, flagged as UTF-8.
$ perl -MDevel::Peek -we'my $buf = "\x80"; open my $fh, "<:utf8", \$buf or die; my $x = <$fh>; Dump $x;'
utf8 "\x80" does not map to Unicode at -e line 1, <$fh> line 1.
SV = PV(0x814fbb4) at 0x814f6cc
  REFCNT = 1
  FLAGS = (PADBUSY,PADMY,POK,pPOK,UTF8)
  PV = 0x8170a78 "\200"\0Malformed UTF-8 character (unexpected continuation byte 0x80, with no preceding start byte) in subroutine entry at -e line 1, <$fh> line 1.
 [UTF8 "\x{0}"]
  CUR = 1
  LEN = 80
:encoding(UTF-8) replaces the bad data (with the 4 chars '\x80' in this case).
$ perl -MDevel::Peek -we'my $buf = "\x80"; open my $fh, "<:encoding(UTF-8)", \$buf or die; my $x = <$fh>; Dump $x;'
utf8 "\x80" does not map to Unicode at -e line 1.
SV = PV(0x814fbb4) at 0x814f6cc
  REFCNT = 1
  FLAGS = (PADBUSY,PADMY,POK,pPOK,UTF8)
  PV = 0x8197048 "\\x80"\0 [UTF8 "\\x80"]
  CUR = 4
  LEN = 80
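If you would rather reject bad data than have it passed through or replaced, one option is to read the raw bytes and decode them yourself with a strict check. This is only a minimal sketch: decode_strict is a hypothetical helper name of mine, not something from the thread, and it assumes you already have the bytes in a scalar.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: returns the decoded string,
# or undef if the bytes are not valid UTF-8.
sub decode_strict {
    my ($bytes) = @_;   # work on a copy; decode() may modify its argument when a CHECK value is given
    # FB_CROAK makes decode() die on malformed input; eval catches that.
    my $text = eval { decode('UTF-8', $bytes, Encode::FB_CROAK) };
    return $text;
}

my $bad  = "\x80";      # lone continuation byte, as in the examples above
my $good = "\xC3\xA9";  # UTF-8 encoding of U+00E9

print defined decode_strict($bad)  ? "ok\n" : "malformed\n";
print defined decode_strict($good) ? "ok\n" : "malformed\n";
```

With this approach the caller decides what to do with malformed input (skip it, log it, die) instead of getting a silent substitution.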
I can't answer your other questions.
In reply to Re: :utf8 I/O layer vs encoding(UTF8), segfault and speed
by ikegami
in thread :utf8 I/O layer vs encoding(UTF8), segfault and speed
by mje