Hi all
I need to parse text data stored in UTF-8, and I want to use Parse::RecDescent. But because the parser code generated by the Parse::RecDescent module doesn't include 'use utf8;', I can't even split the text into words!
In my grammar I can use a regexp like
/[\wÄÜÖäâáçëéîíöôóôüúß']+/
to specify what belongs to a word, but this small example will work only for a small subset of languages with a Latin alphabet. And for all other languages...?
Or would it be better to modify the Parse::RecDescent module so that it adds 'use utf8;' to the generated code? Is that safe?
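To make the word-splitting part of the question concrete, here is a minimal sketch in plain Perl, without Parse::RecDescent. It assumes the input arrives as raw UTF-8 bytes and is decoded into a character string before matching. The sample bytes and variable names are made up for illustration. Note that 'use utf8;' only declares the encoding of the source file itself, so it is not needed here because the script contains no non-ASCII literals. Whether the same pattern behaves identically inside a Parse::RecDescent rule is an assumption I have not verified.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical input: raw UTF-8 bytes, as they might be read from a file.
# The bytes spell four words in Latin and Cyrillic script.
my $bytes = "na\xc3\xafve caf\xc3\xa9 "
          . "\xd0\xbf\xd1\x80\xd0\xb8\xd0\xb2\xd0\xb5\xd1\x82 don't";

# Decode the bytes into a Perl character string before matching.
my $text = decode('UTF-8', $bytes);

# \p{L} matches a letter in any script; the apostrophe is kept,
# as in the hand-written character class.
my @words = $text =~ /([\p{L}']+)/g;

# Encode on output so the terminal receives UTF-8 again.
binmode STDOUT, ':encoding(UTF-8)';
print "$_\n" for @words;
```

The point of the sketch is that the set of word characters comes from a Unicode property instead of an enumerated list of accented letters, so it is not tied to one alphabet.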
In reply to Parse::RecDescent and utf-8 by ph0enix