I use taint mode in all of my CGI programs and am starting to wonder if I'm being too restrictive in some cases.
Generally when 'free text' input is required, I use a regex to ensure it matches \w and a small number of
punctuation characters, and substitute line-breaks with <br>'s.
The data I'm talking about here will be inserted into a database (using placeholders)
and displayed again as HTML (after going through CGI's escapeHTML method);
it will not be used as a filename, passed to system calls, etc.
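For concreteness, here is a minimal sketch of the kind of allow-list untainting described above. The sub name `untaint_free_text` and the exact punctuation set are made up for illustration, not taken from any real program; the `/a` flag (Perl 5.14+) keeps `\w` and `\s` ASCII-only.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the untainted text, or undef (in scalar
# context) if the input contains anything outside the allowed set.
sub untaint_free_text {
    my ($input) = @_;
    return unless defined $input;

    # Word characters, whitespace (including line breaks) and a small,
    # illustrative set of punctuation. /a restricts \w and \s to ASCII.
    if ( $input =~ /\A([\w\s.,;:!?'"()\-]*)\z/a ) {
        return $1;    # text captured by a regex group is untainted under -T
    }
    return;
}

my $clean = untaint_free_text("Hello, world!\nSecond line.");
print defined $clean ? "accepted\n" : "rejected\n";
```

Anything that fails the match comes back as undef, so the caller can reject the submission rather than silently strip characters.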
I'm now in the position of wanting to allow similarly 'free text' Unicode input, and I don't know what is realistic to
allow.
I'm quite tempted to allow anything other than the null byte, which is the only thing I can think of that might
mess up either the database insertion or the HTML display.
However, I've always made a practice of checking that the data contains only what I do want to allow, rather than excluding what I don't.
I've Super Searched for "taint unicode" and haven't found anything that really helps.
I've read the core Perl Unicode docs and understand how to untaint using Unicode character classes.
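To make the question concrete, this is roughly what I imagine a Unicode allow-list would look like. It is only a sketch: the sub name `untaint_unicode_text` is made up, the choice of general categories is my guess at a reasonable starting point, and it assumes the input has already been decoded from bytes into a Perl character string.

```perl
use strict;
use warnings;

# Hypothetical helper: expects an already-decoded character string.
# Returns the untainted text, or undef (in scalar context) on rejection.
sub untaint_unicode_text {
    my ($input) = @_;
    return unless defined $input;

    # Allow letters, combining marks, numbers, punctuation, symbols and
    # space separators, plus newline, carriage return and tab. Everything
    # else (other control characters including NUL, format characters,
    # private-use and unassigned code points) causes a rejection.
    if ( $input =~ /\A([\p{L}\p{M}\p{N}\p{P}\p{S}\p{Zs}\n\r\t]*)\z/ ) {
        return $1;    # text captured by a regex group is untainted under -T
    }
    return;
}

my $clean = untaint_unicode_text("\x{65E5}\x{672C}\x{8A9E}, ok?");
print defined $clean ? "accepted\n" : "rejected\n";
```

Note that this still lets through characters such as `<`, `>` and `&`, so it relies entirely on placeholders and escapeHTML for output safety. It also rejects format characters like the zero-width joiner, which some scripts legitimately need, so the category list would have to be adjusted for real use.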
Can anyone give me some advice or real-world examples?
Does perlmonks.org use taint mode and how does it untaint the
Seekers of Perl Wisdom "Your question" input?