UTF-8 is a variable-length (1 to 6 bytes, current character allocations require 4 bytes)...
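
To make the variable-length property concrete, here is a minimal sketch that counts the bytes UTF-8 needs for a few code points. It assumes the core Encode module is available; the helper name utf8_byte_length is hypothetical and used only for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: number of bytes in the UTF-8 encoding
# of a single code point.
sub utf8_byte_length {
    my ($code_point) = @_;
    return length encode('UTF-8', chr $code_point);
}

# One sample code point from each of the 1-, 2-, 3- and 4-byte ranges.
for my $cp (0x41, 0xE9, 0x20AC, 0x1F600) {
    printf "U+%04X takes %d byte(s)\n", $cp, utf8_byte_length($cp);
}
```

The loop should report 1, 2, 3 and 4 bytes respectively, which matches the "current allocations require 4 bytes" limit quoted above.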

As an example, when Perl sees $x = chr(400), it encodes the character in UTF-8 and stores it in $x. Then it is marked as character data, so, for instance, length $x returns 1. However, in the scope of the bytes pragma, $x is treated as a series of bytes (the bytes that make up the UTF-8 encoding) and length $x returns 2:
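
$x = chr(400);
print 'Length: ', length $x, qq~\n~;    # character semantics
# ... (lines elided in the original post)
{
    use bytes;    # byte semantics apply only inside this block
    print 'Length (bytes): ', length $x, qq~\n~;
}
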
Length: 1
Length (bytes): 2
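
To see which two bytes account for that second length, the following sketch encodes the string explicitly instead of relying on the bytes pragma. It assumes the core Encode module; the helper name utf8_bytes is hypothetical.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: the UTF-8 bytes of a character string,
# returned as a list of integers.
sub utf8_bytes {
    my ($string) = @_;
    return unpack 'C*', encode('UTF-8', $string);
}

my $x = chr(400);    # U+0190, one character
print 'Characters: ', length $x, "\n";
print 'Bytes: ', join('.', utf8_bytes($x)), "\n";
```

For chr(400) this should print one character and the byte sequence 198.144 (0xC6 0x90). Encoding explicitly keeps character semantics intact everywhere else in the program, which is one reason to prefer it over a lexically scoped use bytes outside of debugging.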