Perl can handle these and more. Perl uses Unicode when warranted, and the UTF-8 encoding in particular. Check out perlunicode for general concepts, and perlretut and perlre for advice on character classes (of which \w is implicitly one) and on the coding of such characters.
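As a rough illustration of \w acting as a Unicode-aware character class, here is a minimal sketch. It assumes Perl 5.14 or later (for the /u modifier) and a source file saved as UTF-8; the helper name words_in and the sample string are made up for this example and do not come from the original question.

```perl
use strict;
use warnings;
use utf8;                                # this source file is UTF-8 encoded
binmode(STDOUT, ':encoding(UTF-8)');     # encode output as UTF-8

# Hypothetical helper: return every run of \w characters in a string.
sub words_in {
    my ($text) = @_;
    # /u (Perl 5.14+) asks for Unicode rules, so \w also matches
    # accented letters and not only [A-Za-z0-9_].
    return $text =~ /(\w+)/gu;
}

# Sample data, invented for the illustration.
my @words = words_in('Zoë Müller, Łódź');
print "$_\n" for @words;
```

Under those assumptions this should print the three words one per line, with the comma and spaces acting as separators. Without `use utf8` the literal would be read as raw bytes, so decoding the input first (as perlunicode describes) matters as much as the regex itself.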