Update 12/7/03:
Apparently, the POE people had to deal with this very same problem. See how they did it.
Hi Monks.
At the moment, I'm working on some network libraries and trying to add Unicode support to them. My problem is that Perl's system IO functions (the ones I'm using are sysread() and syswrite()) all take the length to read or write in bytes, and yet I can't seem to find any portable way to get that information.
As of 5.6.1, strings are stored internally as UTF-8, and all built-in functions that purport to operate on characters really do operate on characters. That includes length(), which now returns the length in characters as opposed to the length in bytes.
To force length() to return the length in bytes, perlunicode says you can use the bytes pragma, as this example illustrates:
#!/usr/bin/perl -w
use 5.6.1;
use strict;

# three smiley faces:
my $string = "\x{263a}\x{263a}\x{263a}";

printf("%s: %d characters\n", $string, length $string);

{
    use bytes;
    printf("%s: %d bytes\n", $string, length $string);
}
That's all well and good, but unfortunately the bytes pragma was only introduced in 5.6.1, and my libraries are supposed to support Perl back to 5.005. I can't wrap "use bytes;" in an eval block, so what can I do?
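For illustration, here is a minimal sketch of one possible workaround, not a confirmed fix. It assumes that a string eval, unlike a block eval, defers parsing of "use bytes;" until run time, so an older perl without bytes.pm can fall back to plain length(), which should already count bytes there. The name byte_length() is hypothetical and not part of any library.

```perl
#!/usr/bin/perl -w
use strict;

# byte_length() is a hypothetical helper name, not from any library.
BEGIN {
    # The string is only compiled when eval runs, so a missing bytes.pm
    # makes this eval fail quietly instead of aborting the whole script.
    # "use bytes" is lexically scoped, so it applies to the sub compiled
    # inside the same eval string.
    my $ok = eval q{
        use bytes;
        sub byte_length { return length $_[0] }
        1;
    };
    unless ($ok) {
        # Assumption: no bytes.pm means a perl without character
        # semantics, where length() already returns bytes.
        eval q{ sub byte_length { return length $_[0] } 1 } or die $@;
    }
}

# three smiley faces, as in the example above:
my $string = "\x{263a}\x{263a}\x{263a}";
printf("%d characters, %d bytes\n", length $string, byte_length($string));
```

The value returned by byte_length() could then be passed as the length argument to syswrite(). Whether this behaves identically on every perl between 5.005 and 5.6.1 is untested here.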
In reply to Perl + Unicode == Networking Woes by William G. Davis