And so it came to pass that
holli was given the task of developing a webapp using Catalyst and the Template Toolkit.
"See!", the powers that must be obeyed commanded,
"in this form the user shall input his name, so that he can be rewarded with personalized emails. But!", and the skies rumbled,
"of course the name field will have to be checked via a regex to ensure there is no bad input!"
Some time later,
holli had finished the controller for the form and started testing. To his astonishment
he had to discover that the regex
he used to match the user names against failed. More precisely, it does not match German umlauts, regardless of whether they are matched against a word character (
\w) or an explicit character class
[äÄöÖüÜ]. Now
he knows about different encodings, but all files in his project are encoded in UTF-8 (templates, HTML, code, everything).
Now, who can tell
that poor sod how to proceed to make the regex match?
P.S.
The actual regex used is
qr/^\w[\w\s\-]+$/.
The umlauts seem to be correctly encoded in the request (
nachname=M%C3%BCller =>
nachname=Müller)
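
For anyone who wants to reproduce the symptom outside of Catalyst, here is a minimal sketch. It rests on an assumption that is not confirmed above: that the controller receives the parameter as raw, undecoded UTF-8 bytes rather than as a Perl character string. The variable names are made up for illustration; only the regex and the request value come from the post.

```perl
use strict;
use warnings;
use Encode qw(decode);

# The regex from the post.
my $name_re = qr/^\w[\w\s\-]+$/;

# Percent-decode the request value; the result is a string of raw bytes.
(my $bytes = 'M%C3%BCller') =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;

# Assumption: the form handler sees exactly these bytes.
# Decoding them as UTF-8 yields a character string instead.
my $chars = decode('UTF-8', $bytes);

my $bytes_match = $bytes =~ $name_re ? 1 : 0;
my $chars_match = $chars =~ $name_re ? 1 : 0;

printf "bytes: length %d, match %d\n", length($bytes), $bytes_match;
printf "chars: length %d, match %d\n", length($chars), $chars_match;
```

On a stock Perl 5 without locale settings this prints length 7, match 0 for the byte string and length 6, match 1 for the decoded string, which would be consistent with the failure described, if the assumption holds.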