Don't untaint all your data; that defeats the purpose of running in taint mode. Leave the data tainted unless taint mode stops you from doing something you need to do, and then untaint (carefully) only the values you need to use that way. In fact, if you use a regex to parse fields out of something, you should mark the extracted fields as tainted again unless your regex was carefully constructed to make sure they're safe, because Perl treats captured subpatterns as untainted.

The whole point of taint mode is to alert you when you're doing something potentially unsafe. At that point, you want to check the specific datum involved in terms of the operation you're performing, to make it safe for that operation. For example, if you're making a system() call that will be interpreted by a shell, you want to strip shell metacharacters. But you don't need to strip shell metacharacters when you send an email.
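
To make the "untaint for one specific operation" idea concrete, here is a minimal sketch. The helper name untaint_filename is made up for illustration, and it uses a whitelist of allowed characters rather than stripping metacharacters, which is a stricter variation on the advice above. It assumes the value is only ever going to be used as a simple file name; a different operation would need a different check.

```perl
use strict;
use warnings;

# Hypothetical helper: untaint a value for ONE purpose only,
# namely use as a simple file name in the current directory.
sub untaint_filename {
    my ($value) = @_;

    # Whitelist: letters, digits, underscore, dot, hyphen.
    # The first character may not be a dot or a hyphen, which rules out
    # "..", hidden files, and anything that looks like an option.
    # Under -T, the captured $1 is what Perl considers untainted.
    return $value =~ /\A([A-Za-z0-9_][A-Za-z0-9_.-]*)\z/ ? $1 : undef;
}

for my $input ('report-2004.txt', 'x; rm -rf /', '../etc/passwd') {
    my $name = untaint_filename($input);
    print defined $name ? "ok: $name\n" : "rejected: $input\n";
}
```

Only the first input passes. Everything else in the program stays tainted, so taint mode keeps complaining if some other variable wanders into an unsafe operation.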

MySQL can store anything safely, if you use ? placeholders in the SQL and pass the values in the execute() call. However, you need to think about what you're going to do with the data when they come out of MySQL. If you don't check them before you put them in, treat them as tainted when you take them out.
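
A rough sketch of the placeholder approach follows. It assumes $dbh is an already-connected DBI database handle; the sub name store_comment and the table and column names (comments, author, body) are invented for the example.

```perl
use strict;
use warnings;

# Assumes $dbh is an already-connected DBI database handle.
# The table and column names (comments, author, body) are hypothetical.
sub store_comment {
    my ($dbh, $author, $body) = @_;

    # The ? placeholders keep the values out of the SQL text entirely,
    # so quotes or semicolons in $body cannot change the statement.
    my $sth = $dbh->prepare(
        'INSERT INTO comments (author, body) VALUES (?, ?)'
    );
    return $sth->execute($author, $body);
}
```

For the way back out, DBI has a TaintOut handle attribute which, when set and when Perl is running in taint mode, marks most fetched data as tainted. That is probably the easiest way to get the "tainted when you take them out" behaviour, though check the DBI docs for your version.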

As far as content going to the browser: decide whether it's plain text or HTML. If it's text, just encode the entities and have done. This is easy (there is a module for it on CPAN) and as safe as is necessary for ordinary purposes. If it's HTML, you'll want to check it for certain dangerous things, like scripts, and personally I also like to minimally parse it (basically just check for well-formedness), and if it's not well-formed, revert to treating it like plain text (i.e., encode entities). This will annoy people who like to write old-style HTML with <p> tags between (instead of around) paragraphs, but it will also prevent any number of easy-to-make stupid mistakes, like forgetting to close off a table (which causes huge problems for older browsers).
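
For the plain-text case, the encoding step is roughly the following. This is a hand-rolled stand-in so the example runs with core Perl only; the sub name encode_text is made up, and in real code you would more likely use a CPAN module for this (HTML::Entities and its encode_entities function is one such module).

```perl
use strict;
use warnings;

# The five characters that matter when dropping text into HTML.
my %ENTITY = (
    '&' => '&amp;',
    '<' => '&lt;',
    '>' => '&gt;',
    '"' => '&quot;',
    "'" => '&#39;',
);

# Hypothetical helper: encode text so the browser shows it literally.
sub encode_text {
    my ($text) = @_;
    $text =~ s/([&<>"'])/$ENTITY{$1}/g;
    return $text;
}

print encode_text(q{<script>alert("x")</script> & more}), "\n";
# prints: &lt;script&gt;alert(&quot;x&quot;)&lt;/script&gt; &amp; more
```

The same sub is what you would fall back to when submitted HTML fails the well-formedness check.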

For email: if you're sending as text/plain, which you should be, I wouldn't worry about it too much. There are tricks that can be played to make Outlook think something is an attachment even though the headers don't say so, but people who use Outlook are going to get viruses regularly anyway, so don't sweat it.


$;=sub{$/};@;=map{my($a,$b)=($_,$;);$;=sub{$a.$b->()}} split//,".rekcah lreP rehtona tsuJ";$\=$ ;->();print$/

In reply to Re: Back to acceptable untainted characters by jonadab
in thread Back to acceptable untainted characters by bradcathey
