in reply to problems with locale

The locale pragma does not change what an explicit range such as a-z matches, so a character class like [a-z0-9,:;"' ] in a regular expression still covers only the unaccented letters. To match the accented letters as well, you should be able to use something like [[:lower:]0-9,:;"' ] instead. These classes are described in perlre; search for "POSIX character class".

update: I should clarify that POSIX character classes aren't the only way locales are supported in regex character classes, of course. As chromatic says, the reason yours didn't work is that you enumerated a-z explicitly. As long as you aren't using explicit ranges like that, accented characters and the like won't be a problem; things like \w work as expected under use locale.
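To make the difference concrete, here is a minimal sketch that tests one accented letter against an explicit range, a POSIX class, and \w. The locale name fr_FR.ISO8859-1 is only an assumption (locale names vary by system, and it may not be installed on yours), so the last two answers depend on whichever locale actually ends up in effect.

```perl
use strict;
use warnings;
use POSIX qw(locale_h);   # exports setlocale() and the LC_* constants
use locale;               # make \w and POSIX classes consult the locale

# Assumption: a Latin-1 French locale is installed under this name.
my $wanted = 'fr_FR.ISO8859-1';
my $got    = setlocale(LC_CTYPE, $wanted);
print defined $got
    ? "using locale $got\n"
    : "locale $wanted not available; keeping the current one\n";

my $e_acute = chr(0xE9);  # e-acute as a single Latin-1 byte

# An explicit range is a fixed span of code points, so the accented
# letter falls outside it no matter which locale is active.
print "[a-z]       : ", ($e_acute =~ /^[a-z]$/       ? "match" : "no match"), "\n";

# These two ask the locale whether the character is a lowercase
# letter or a word character, so the result depends on the locale.
print "[[:lower:]] : ", ($e_acute =~ /^[[:lower:]]$/ ? "match" : "no match"), "\n";
print "\\w          : ", ($e_acute =~ /^\w$/          ? "match" : "no match"), "\n";
```

The [a-z] line should always report "no match"; if the other two do as well, the locale in effect probably does not classify that byte as a letter.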

(Why aren't you using or die ... on the second open there? Also, Perl lets you use / instead of \\ in paths for things like open.)
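As a rough illustration of both points, the sketch below checks every open with or die and builds its path with a forward slash. The file name words.txt and the temporary directory are made up for the example; they are not taken from your script.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical scratch directory and file name, for illustration only.
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/words.txt";   # a forward slash is fine on Windows too

# Check the first open ...
open(my $out, '>', $path) or die "Cannot write $path: $!";
print {$out} "one\ntwo\n";
close($out) or die "Cannot close $path: $!";

# ... and the second one as well, so a failure is reported here
# instead of showing up later as mysteriously empty input.
open(my $in, '<', $path) or die "Cannot read $path: $!";
my @lines = <$in>;
close($in);

print "read ", scalar(@lines), " lines\n";
```

Including $! in the message tells you why the open failed (missing file, permissions, and so on), which is usually the first thing you want to know.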

Please turn on the features Perl provides to help you, such as use strict and use warnings.