UTF-8 is a variable-length (1 to 6 bytes; current character allocations require at most 4 bytes)...
####
As an example, when Perl sees $x = chr(400), it encodes the character in UTF-8 and stores it in $x. The value is then marked as character data, so, for instance, length $x returns 1. However, in the scope of the bytes pragma, $x is treated as a series of bytes (the bytes that make up the UTF-8 encoding), and length $x returns 2:
####
$x = chr(400);
print 'Length: ', length($x), "\n";
{
    use bytes;
    print 'Length (bytes): ', length($x), "\n";
}
####
Length: 1
Length (bytes): 2
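The same character-versus-byte distinction can also be seen without the bytes pragma by encoding the string explicitly. The following is a minimal sketch, assuming the core Encode module is available. The helper name utf8_hex and the sample code points in the loop are illustrative choices, not part of the original text.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper (not from the original text): returns the UTF-8
# octets of a single code point as space-separated hex digits.
sub utf8_hex {
    my ($code_point) = @_;
    my $octets = encode('UTF-8', chr $code_point);
    return join ' ', map { sprintf '%02X', ord } split //, $octets;
}

my $x      = chr(400);            # one character, U+0190
my $octets = encode('UTF-8', $x); # byte string holding its UTF-8 encoding

print 'Characters: ', length($x), "\n";         # prints 1
print 'Octets: ', length($octets), "\n";        # prints 2
print 'Octets (hex): ', utf8_hex(400), "\n";    # prints C6 90

# Sample code points that need 1, 2, 3 and 4 octets respectively.
for my $cp (0x41, 0x190, 0x20AC, 0x1F600) {
    printf "U+%04X -> %s\n", $cp, utf8_hex($cp);
}
```

Here encode returns a separate byte string, so length reports 2 for it while $x itself remains a one-character string. The loop prints 41, C6 90, E2 82 AC and F0 9F 98 80 for the four sample code points.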