in reply to Re^3: Best Way to Get Length of UTF-8 String in Bytes?
in thread Best Way to Get Length of UTF-8 String in Bytes?
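I see use bytes; without any utf8::upgrade or utf8::downgrade, and that usually indicates code that suffers from "The Unicode Bug": bytes::length reports the size of the string's internal buffer, so its result depends on whether the string happens to be stored upgraded or downgraded.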
sub bytelen(_) { require bytes; return bytes::length($_[0]); }
should be
sub utf8len(_) { utf8::upgrade($_[0]); require bytes; return bytes::length($_[0]); }
Or the same without bytes:
sub utf8len(_) { require Encode; utf8::upgrade($_[0]); Encode::_utf8_off($_[0]); my $utf8len = length($_[0]); Encode::_utf8_on($_[0]); return $utf8len; }
Update: Added non-bytes alternative.
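To make the difference concrete, here is a minimal sketch comparing the two subs on the same one-character string stored both ways. The variable names ($downgraded, $upgraded, @before, @after) are mine, chosen only for illustration, and bytes is loaded up front instead of inside each sub.

```perl
use strict;
use warnings;
use bytes ();    # loads bytes::length without turning on byte semantics

# The version that depends on how the string happens to be stored.
sub bytelen(_) { return bytes::length($_[0]); }

# The corrected version: force UTF-8 storage before measuring.
sub utf8len(_) { utf8::upgrade($_[0]); return bytes::length($_[0]); }

# U+00E9 is a single character that takes two bytes in UTF-8.
my $downgraded = "\xE9";
my $upgraded   = "\xE9";
utf8::downgrade($downgraded);    # internal storage: one byte per character
utf8::upgrade($upgraded);        # internal storage: UTF-8

# Perl considers the two strings equal ...
my $same = ($downgraded eq $upgraded) ? 1 : 0;

# ... yet bytelen returns different answers for them, which is the bug.
my @before = (bytelen($downgraded), bytelen($upgraded));

# utf8len returns the same answer for both.
my @after = (utf8len($downgraded), utf8len($upgraded));

print "same=$same bytelen=@before utf8len=@after\n";
# prints: same=1 bytelen=1 2 utf8len=2 2
```

Note that utf8::upgrade changes the internal storage of the caller's variable in place, but not the string's value as Perl code sees it.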