sub is_utf8 { utf8::upgrade( $_[0] ); return 1; }
But seriously,
sub is_utf8 {
    my $s = "\x80" . $_[0];                   # high-bit prefix: stored as 80 or C2 80
    my $internal = unpack "p", pack "p", $s;  # raw buffer bytes
    return $s ne $internal;
}
Tested in 5.8.0, 5.8.8 and 5.10.0 using:
utf8::downgrade( my $empty_dn = '' );           # 0
utf8::upgrade(   my $empty_up = '' );           # 1
utf8::downgrade( my $ascii_dn = 'a' );          # 0
utf8::upgrade(   my $ascii_up = 'a' );          # 1
utf8::downgrade( my $hibit_dn = chr(0xC9) );    # 0
utf8::upgrade(   my $hibit_up = chr(0xC9) );    # 1
utf8::upgrade(   my $wide_up  = chr(0x2660) );  # 1

for (
    $empty_dn, $empty_up,
    $ascii_dn, $ascii_up,
    $hibit_dn, $hibit_up,
    $wide_up,
) {
    print is_utf8($_) ? 1 : 0, "\n";
}
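To see why the second is_utf8 prepends "\x80" before comparing, here is a minimal sketch that dumps the raw buffer of a few strings in both storage formats. The helpers raw_buffer and hex_bytes are hypothetical names made up for this illustration, and the hex values in the comments assume an ASCII (non-EBCDIC) platform.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original post): raw bytes of a scalar's
# string buffer, read through a pointer up to the first NUL byte.
sub raw_buffer {
    my $s = $_[0];    # copying keeps the internal storage format
    return unpack "p", pack "p", $s;
}

# Hypothetical helper: render a byte string as space-separated hex.
sub hex_bytes {
    return join ' ', map { sprintf '%02X', ord $_ } split //, $_[0];
}

utf8::downgrade( my $ascii_dn = 'a' );
utf8::upgrade(   my $ascii_up = 'a' );
utf8::downgrade( my $hibit_dn = "\x80" );
utf8::upgrade(   my $hibit_up = "\x80" );

print hex_bytes( raw_buffer($ascii_dn) ), "\n";    # 61
print hex_bytes( raw_buffer($ascii_up) ), "\n";    # 61
print hex_bytes( raw_buffer($hibit_dn) ), "\n";    # 80
print hex_bytes( raw_buffer($hibit_up) ), "\n";    # C2 80
```

A pure-ASCII string has the same buffer either way, so comparing it directly would tell you nothing. The "\x80" prefix guarantees at least one character whose encoding differs between the two formats. One caveat, as far as I can tell: "p" reads only up to the first NUL byte, so a string containing "\0" would come back truncated and compare unequal regardless of its storage format. The test list above does not cover that case.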
Why do you need to know?
Update: Added test code.
In reply to Re: good way to implement utf8::is_utf8 for perl 5.8.0
by ikegami
in thread good way to implement utf8::is_utf8 for perl 5.8.0
by perl5ever