Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:
The perlsec and perllocale documentation pages make it very clear that if a program does:
use locale;
then untainting will not work if the regular expression used to untaint contains a locale-dependent character class such as \w, because under taint mode perl does not trust the locale information on the host system (if I understand correctly).
This makes sense, but what I can't find are any examples or documentation on the correct way to proceed when one wants to untaint text which may contain characters such as accented letters.
It sounds as if I somehow need to:
a) stop doing use locale
b) (re)define the common character classes such as \w myself to include things like accented characters
But I don't know how to do this, and I'm not really sure whether this is definitely the way to go.
Someone must have come across this problem previously. I would really appreciate your advice/guidance. It feels like I'm missing something fundamental.
Surely many Perl web-based applications have to untaint data that contains non-English characters.
What is the correct secure way to untaint this data?
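For concreteness, here is a minimal sketch of what steps (a) and (b) might look like. It is not presented as the definitive answer. It assumes the input arrives as UTF-8 encoded bytes, and it follows the perlsec suggestion of putting no locale ahead of the laundering expression in the same block. The helper name untaint_name, the set of allowed characters, and the 100-character limit are all hypothetical choices made for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: returns an untainted, decoded copy of $bytes,
# or undef if the input is not valid UTF-8 or contains characters
# outside the allowed set.
sub untaint_name {
    my ($bytes) = @_;
    return undef unless defined $bytes;

    # Assumption: the input is UTF-8. Decode strictly so that malformed
    # byte sequences are rejected instead of being silently replaced.
    my $text = eval {
        decode('UTF-8', $bytes, Encode::FB_CROAK | Encode::LEAVE_SRC);
    };
    return undef unless defined $text;

    # Step (a): make sure locale rules are off for this block, so that
    # the capture below is allowed to launder the data.
    no locale;

    # Step (b): spell out the allowed characters with Unicode properties
    # instead of relying on a locale-dependent \w. Letters, combining
    # marks, decimal digits, space, apostrophe and hyphen are allowed.
    return $text =~ /\A([\p{L}\p{M}\p{Nd} '\-]{1,100})\z/ ? $1 : undef;
}
```

The returned value is a string of characters rather than bytes, so it would need to be encoded again on output. Whether this set of allowed characters is safe depends entirely on what the data is later used for.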
Re: untainting and locales and internationalisation
by dtr (Scribe) on Aug 24, 2005 at 14:00 UTC

Re: untainting and locales and internationalisation
by Anonymous Monk on Aug 26, 2005 at 19:39 UTC