I wasn't too clear in my original writeup.

In addition to the "data," "field," "obligatory," "min," and "max" parameters, you have a "regex" parameter holding a string of characters that you drop into a character class and then compare against the actual parameter value. This is very limiting, though. Instead of only being able to check whether a parameter value contains certain characters, why not extend the sub so you can compare the actual parameter value against entire, complicated regular expressions?

First, assuming you change the name of the option from "regex" to "value," change this:

if ($checks{'data'} =~ /([^$checks{'regex'}])/) { bail_out("Bad input."); }

To this:

if (defined $checks{'value'}) {
    if (ref $checks{'value'} eq 'Regexp') {
        bail_out("Bad input.") unless ($checks{'data'} =~ $checks{'value'});
    }
    else {
        bail_out("Bad input.") unless ($checks{'data'} eq $checks{'value'});
    }
}
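To try the check on its own, here is a minimal sketch that pulls it out into a standalone helper. The name value_matches and the true/false return (instead of calling bail_out) are my assumptions for illustration, not part of your sub:

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $data passes the "value" check, 0 otherwise.
# $value may be a compiled regex (qr//) or a plain string.
sub value_matches {
    my ($data, $value) = @_;

    return 1 unless defined $value;        # no check requested

    if (ref($value) eq 'Regexp') {         # qr// objects report 'Regexp'
        return $data =~ $value ? 1 : 0;    # full regex match
    }

    return $data eq $value ? 1 : 0;        # plain string: exact comparison
}

print value_matches('blue', qr/^(?:red|green|blue)$/) ? "ok\n" : "bad\n";   # ok
print value_matches('on',  'on')                      ? "ok\n" : "bad\n";   # ok
print value_matches('off', 'on')                      ? "ok\n" : "bad\n";   # bad
```

The ref() test is what lets one option accept either kind of value: a qr// object goes through the regex engine, anything else is compared with eq.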

Then you'd use the sub as follows:

$email_address = sanitize(
    data       => param('email'),
    field      => 'Email address',
    obligatory => 1,
    min        => 9,
    value      => qr/^[a-zA-Z0-9\.]+\@[a-zA-Z0-9\.]+$/,
);

$color = sanitize(
    data       => param('color'),
    field      => 'Color',
    obligatory => 1,
    min        => 3,
    max        => 5,    # "green" is five characters long
    value      => qr/^(?:red|green|blue)$/,
);

Notice, now the parameter value is compared against an entire regex as opposed to just a character class. The original example would be rewritten as:

$username = sanitize(
    data       => param('username'),
    field      => 'Username',
    obligatory => 1,
    min        => 8,
    max        => 8,
    value      => qr/^[%;&()#\w ]+$/,
);

Yeah, it's a little more typing, but being able to compare parameter values against custom regular expressions is an absolute must. If you know a parameter value can only contain one value, like "on," because of the ref() checking you can just use a string and sidestep the regex engine and its overhead entirely:

$autosave = sanitize(
    data       => param('autosave'),
    field      => 'Autosave',
    obligatory => 0,
    max        => 2,
    value      => 'on',    # not a regular expression; $checks{'data'} must eq "on"
);


All too often, programmers pick poor names (I'm especially guilty of this), either because they don't put enough thought into them, because they mix their metaphors, or because they come up with creative, long-thought-out names that make perfect sense to them but are completely unintuitive to everyone else. To be perfectly honest with you, I think you could have come up with a better name for the sub itself. Your sub checks a value to make sure it's "clean," so you called it sanitize. But "sanitize" is a verb meaning "to clean something" (which makes me wonder why you didn't use the shorter "clean" instead), and your sub doesn't actually clean anything. Cleaning would involve converting or changing something, but all your sub does is examine something. The name "sanitize" would be far more appropriate for a sub that, for example, converted potentially dangerous HTML characters like <, >, and & to their HTML entity equivalents and all newline characters to <br>; but even for that, "escape" would probably be more appropriate.

Maybe you should call the sub "validate_param" or "check_param" (variants on actual names I've used) or, if you really like sanitize, "is_sanitized" instead?
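To make the difference between examining and cleaning concrete, here is a rough sketch of one sub of each kind. Both names (is_valid_color and escape_html) are made up for this example:

```perl
use strict;
use warnings;

# Hypothetical checker: only examines the value and reports true/false.
sub is_valid_color {
    my ($data) = @_;
    return $data =~ /^(?:red|green|blue)$/ ? 1 : 0;
}

# Hypothetical escaper: returns a changed copy of the value.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;       # must run first, or the entities below get re-escaped
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/\r?\n/<br>/g;    # newlines become <br>
    return $text;
}

print is_valid_color('green'), "\n";       # prints 1
print escape_html("a < b & c\n"), "\n";    # prints a &lt; b &amp; c<br>
```

The first sub leaves its argument alone and just answers yes or no, which is what your sub does; the second hands back different text, which is what a name like "sanitize" or "escape" promises.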


In reply to Re: Re: Re: Check user input - reimplementation by William G. Davis
in thread Check user input - reimplementation by kiat
