This is probably a simple question with a complicated answer. I'm working on a website conversion from ASP->BAMP and am trying to dot as many i's and cross as many t's as possible.

That means complete POD in every module, tests of the core with Test::More, and tests of the taglibs and pages with Apache::Test. And, toward the top of the list, making sure that all user input is expected, as safe as possible, and untainted.

Sounds easy, right? Well, most of that list has been easy, but that last one has me stumped. I have no idea where to start when it comes to scrubbing user input on the web.

Sure, some of those things are easy to check. If it's a quantity field, only allow digits between 1 and the max allowed. But, what if they're inputting the description of something into a text area? What about the name of a product or the name of a vendor or company? Filtering for only a-zA-Z0-9 isn't practical. What about UTF and foreign characters?
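For the easy case, the quantity check might look something like the sketch below. The helper name `valid_quantity` and the nine-digit cap are assumptions made for illustration, not anything from the real code base. The regex capture is what untaints the value when running under `-T`.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the untainted quantity as a number,
# or undef if the input is not a whole number between 1 and $max.
sub valid_quantity {
    my ($input, $max) = @_;
    return undef unless defined $input;

    # Accept ASCII digits only; \z (unlike $) also rejects a trailing newline.
    # The captured value is untainted under taint mode.
    my ($qty) = $input =~ /\A([0-9]{1,9})\z/ or return undef;

    return undef if $qty < 1 || $qty > $max;
    return $qty + 0;    # normalize "007" to 7
}

# Usage: anything that is not a plain in-range integer is rejected.
for my $raw ('12', '0', '51', '12; DROP TABLE items') {
    my $qty = valid_quantity($raw, 50);
    printf "%-22s => %s\n", $raw, defined $qty ? $qty : 'rejected';
}
```

That pattern works because the set of valid values is small and known in advance, which is exactly what free-text fields lack.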

Sure, I can disallow:

` . ; \ / @ & | % ~ < > " $ ( ) { } [ ] * ! '

But is that the correct answer? No periods to end a sentence? No $ sign? No exclamation point? That's not very realistic either.
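One alternative to a blacklist, sketched here only to make the trade-off concrete, is to allow whole Unicode categories and reject just control characters and over-long values. The helper name `valid_name` and the length limit are hypothetical, and the sketch assumes the input has already been decoded into Perl characters. It says nothing about how the value is later quoted for SQL or escaped for HTML output; those are separate steps.

```perl
use strict;
use warnings;

# Hypothetical helper: accepts letters, combining marks, numbers,
# punctuation, symbols and plain spaces from any script.
# Control characters (tabs, newlines, NUL) and empty strings are rejected.
# Assumes $input is a decoded character string, not raw bytes.
sub valid_name {
    my ($input, $max_len) = @_;
    return undef unless defined $input;
    return undef if length($input) > $max_len;

    # The capture untaints the value under taint mode.
    my ($name) = $input =~ /\A([\p{L}\p{M}\p{N}\p{P}\p{S}\p{Zs}]+)\z/
        or return undef;
    return $name;
}

# Usage: accented letters, "&", "$" and "!" all pass; a NUL byte does not.
my $ok  = valid_name("Caf\x{e9} M\x{fc}ller & S\x{f6}hne, \$5 off!", 100);
my $bad = valid_name("bad\x00name", 100);
print defined $ok  ? "accepted\n" : "rejected\n";
print defined $bad ? "accepted\n" : "rejected\n";
```

Note that this lets through every character in the list above, so it only makes sense if the database write and the page output each handle their own quoting.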

HTML::Sanitize seems to be only for HTML. Then there's Safe, but that merely shifts the problem to a safe compartment.

So after all my rambling, what are fellow monks doing in the real world?


In reply to Preferred Way of Scrubbing User Input Before DB Write by jk2addict
