What Zaxo said.
I seem to be getting a lot of mileage out of this advice recently, but have you considered using iconv or recode for this task? Neither is Perl-specific; both are standalone tools designed to convert between character encodings (recode is available for GNU/Linux and Windows, see this link; iconv is distributed with Red Hat GNU/Linux distros).
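For what it's worth, here is a minimal sketch of what an iconv conversion might look like. The file names and the choice of ISO-8859-1 as the source encoding are assumptions for illustration only; substitute whatever your data is actually in.

```shell
# Build a tiny sample file in ISO-8859-1 (hypothetical input):
# octal \351 is the single Latin-1 byte for "e with acute accent".
printf 'caf\351\n' > latin1.txt

# Convert it to UTF-8. -f names the source encoding, -t the target.
iconv -f ISO-8859-1 -t UTF-8 latin1.txt > utf8.txt
```

Running `iconv -l` lists the encoding names your local build accepts, which is worth checking since the supported set varies between systems.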
--
Allolex
In reply to Re: regex for utf-8 by allolex, in thread regex for utf-8 by jjohhn