in reply to Byte length

The correct way is to de-unicode the string into bytes, and take the length of that:

# Repack the string in byte mode (C0), then count the resulting bytes.
sub byte_length { return length pack("C0A*", shift); }

Don't try this on Perl 5.6.0: like most Unicode things, it's probably quite broken there.

Update: Oops, looks like "use bytes" is officially the right way to do it. It's even in the "bytes" man page. You can also use bytes::length() directly if you've loaded the bytes module previously.
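
For illustration, here is a minimal sketch of both forms mentioned in the update. The sample string and the sub name byte_length_pragma are made up for this example. Note that both forms count the bytes of Perl's internal representation of the string.

```perl
use strict;
use warnings;
use bytes ();    # load the module only; byte semantics are NOT enabled here

# Sample string (an assumption for this sketch): one character, U+263A,
# which takes three bytes in UTF-8.
my $str = "\x{263A}";

my $chars = length($str);           # character count
my $bytes = bytes::length($str);    # byte count, called directly

# Hypothetical helper: the pragma is lexically scoped, so byte semantics
# apply only inside this sub.
sub byte_length_pragma {
    my ($s) = @_;
    use bytes;
    return length($s);
}

print "$chars character(s), $bytes byte(s)\n";
print "via pragma: ", byte_length_pragma($str), "\n";
```

Loading the module with an empty import list keeps length() character-based everywhere else in the file, which is usually what you want.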