Around 5.6.1, length() started returning the length of a string in characters instead of the length in bytes. This means that length() called on a multi-byte UTF-8 string will return a smaller number under 5.6.1 and later than it would under previous versions of Perl.
Fortunately, along with Unicode support came the nifty bytes pragma, which can be used to force length() to return the length of a scalar in bytes like it used to. Unfortunately, pre-5.6 versions of Perl don't have bytes.pm, so this routine was born. The trick that lets you "use bytes" whether or not bytes.pm is present was the work of Liz:
    ...
    BEGIN {
        # this hack allows us to "use bytes" or fake it for older (pre-5.6.1)
        # versions of Perl (thanks to Liz from PerlMonks):
        eval { require bytes };

        if ($@) {
            # couldn't find it, but pretend we did anyway:
            $INC{'bytes.pm'} = 1;

            # 5.005_03 doesn't inherit UNIVERSAL::unimport:
            eval "sub bytes::unimport { return 1 }";
        }
    }

    sub size_in_bytes ($) {
        use bytes;

        return length shift;
    }
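To see the difference between the two lengths, here is a minimal usage sketch. It repeats the routine above so that it runs on its own, and it assumes a Perl new enough to understand the \x{...} escape (5.6 or later); the sample string and the variable name $string are made up for illustration.

```perl
use strict;

BEGIN {
    # same trick as above: load bytes.pm, or fake it if it's missing
    eval { require bytes };

    if ($@) {
        $INC{'bytes.pm'} = 1;
        eval "sub bytes::unimport { return 1 }";
    }
}

sub size_in_bytes ($) {
    use bytes;

    return length shift;
}

# illustrative string: "caf", e-acute, a space and U+263A (a smiley).
# The character above 0xFF forces Perl to store the string as UTF-8.
my $string = "caf\x{e9} \x{263A}";

print length($string), "\n";         # 6 (characters)
print size_in_bytes($string), "\n";  # 9 (bytes: 3 + 2 + 1 + 3)
```

The "use bytes" inside size_in_bytes() is lexically scoped, so length() everywhere else in the file keeps counting characters.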