Hello,
I am trying to parse a file containing French accented characters, but the accented characters get transformed. If I read in text with accents and simply print it back out, each accented character is replaced by a completely different character. Here are two examples, with the original word first and the transformed word second:
(modèle, modÞle) and (État, ╔tat)
In a regular expression, I was unable to match these characters using either \w or \W.
I am not sure how to fix the problem. It seems there should be a module I can add to the Perl installation I downloaded. How do I do this?
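For anyone trying to reproduce this, here is a minimal sketch of what may be going on. It rests on two assumptions that are not confirmed: that the file is stored as ISO-8859-1 (Latin-1) and that the console uses code page 850. Under those assumptions, è is byte 0xE8 and É is byte 0xC9 in the file, and those same bytes display as Þ and ╔ in CP850, which matches the examples above. The sketch uses the core Encode module; the sample string stands in for data read from the file.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Assumption: input bytes are ISO-8859-1 and the console is CP850.
# This string stands in for raw bytes read from the file.
my $bytes = "mod\xE8le \xC9tat";

# Decode the bytes into Perl characters so that \w can treat
# the accented letters as word characters.
my $text  = decode('ISO-8859-1', $bytes);
my @words = $text =~ /(\w+)/g;

# Encode to the console's code page before printing.
print encode('cp850', join("\n", @words) . "\n");
```

If the assumptions hold, the same effect could probably be had when reading a real file by opening it with an encoding layer such as '<:encoding(ISO-8859-1)' instead of calling decode by hand.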
Many thanks