Input and Output Disciplines
There is currently no easy way to mark data read from a file or other external source as being utf8. This will be one of the major areas of focus in the near future.
So part of the problem may be this: you expect your query parameter to be encoded in UTF-8 (I'm assuming), but your script just sees a sequence of extended-ASCII characters, i.e. raw bytes. You might be able to get around this by explicitly using pack "U", ... to reconstruct UTF-8 characters from the input one at a time, but I don't recall whether I ever got that technique to work reliably.
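To illustrate the byte-versus-character mismatch described above, here is a minimal sketch. It assumes Perl 5.8 or later, where the Encode module is available; it does not use the pack "U" approach. The sample input in $raw is made up for illustration and stands in for whatever your CGI layer hands you.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical raw query parameter: the UTF-8 bytes for "café",
# as a script would see them before any decoding.
my $raw = "caf\xC3\xA9";

# Decode the byte string into a character string.
# FB_CROAK makes decode() die on malformed UTF-8;
# LEAVE_SRC keeps decode() from modifying $raw in place.
my $text = decode('UTF-8', $raw, Encode::FB_CROAK | Encode::LEAVE_SRC);

# Prints: bytes in: 5, characters out: 4
printf "bytes in: %d, characters out: %d\n", length($raw), length($text);
```

Until the input is decoded, Perl has no way to know that the two bytes \xC3\xA9 are meant to be one character, which is why the script appears to see extended-ASCII characters.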
If you're just trying to ensure that an input string doesn't exceed a particular character length, you should be able to use length($string) to get its length in characters rather than bytes. That assumes that you already have it stored internally as UTF-8, of course, and that use bytes is not in effect.
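A small sketch of that length check, again assuming Perl 5.8 or later with Encode. The limit $max_chars and the sample input are hypothetical. It also shows what changes when use bytes is in effect.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical limit and input, chosen for illustration only.
my $max_chars = 5;
my $raw  = "na\xC3\xAFve";          # UTF-8 bytes for "naïve": 6 bytes
my $text = decode('UTF-8', $raw);   # character string: 5 characters

# length() counts characters on a decoded string...
my $char_len = length($text);

# ...but counts bytes of the internal representation under "use bytes".
my $byte_len = do { use bytes; length($text) };

# Prints: characters: 5, bytes: 6
print "characters: $char_len, bytes: $byte_len\n";
print "within limit\n" if $char_len <= $max_chars;
```

The use bytes pragma is lexically scoped, so confining it to the do block leaves length() counting characters everywhere else.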
Unicode support in 5.8 is supposed to be much improved, but I haven't had a chance to try it for myself yet.
$perlmonks{seattlejohn} = 'John Clyman';
In reply to Re: Unicode word wrapping by seattlejohn, in thread Unicode word wrapping by lestrrat