tanger has asked for the wisdom of the Perl Monks concerning the following question:

hi,
I'm using taint mode and would like to untaint valid usernames and passwords. Basically, I'm not sure what to allow and what not to allow. Should passwords only allow alphabetical and numerical characters, or can a password be "This_is-my|pass,,word"? I'll be storing the password as an MD5 hash in a MySQL db; that field is VARCHAR(16). So if I were to do a regex for the password input from the user, would I allow just alphabetical and numerical characters? If yes, is this the proper regex/untaint method:
$user_pass = $INPUT->param('pass');
if ($user_pass =~ /^[A-Za-z0-9]+$/ && length($user_pass) < 17) {
    $user_pass = $1;
}
Are there any implications to what I'm trying to do? I'm not too strict with password rules, either; the only one I have is that a member's password must be 6 characters or longer.

What about the username? If I want to allow a username like "Perl_Monks", how can I write the proper regex for it? Is the "_" (underscore) character part of the A-Za-z0-9 pattern?

Or am I worrying about this issue too much? Basically I want to secure the script as much as possible so no one can enter any unwanted input to my script.

ty
tanger

Replies are listed 'Best First'.
Re: Regex usernames and Passwords
by Joost (Canon) on Apr 16, 2005 at 20:44 UTC
    Your code looks fine to me. You might want to alert the user (or at least abort the program) if the match doesn't succeed, though.

    You seem to have about the right amount of paranoia :-) Just remember that it's better to match for well-formed input instead of trying to match "bad" input. You're doing that correctly here. Also, mysql char/text/blob columns will generally handle any input that is short enough, as long as you use placeholders or $dbh->quote. You might have other reasons to restrict/quote input, though. For example, if you print user input to an HTML page, you'll probably want to use $cgi->escapeHTML(), and for a phone number, you might not want alphabetic characters...

    Underscore is not part of the [A-Za-z0-9] pattern. Fortunately, for ASCII data \w is equivalent to [A-Za-z0-9_] (on Unicode strings or under a locale it also matches other word characters), so you can save a few keystrokes :-)

    By the way, I'd probably give users a few more characters to use in their passwords (what about !@#$%^&*()_+{}{}:;.,<>" and ?).
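To make the underscore point concrete, here is a minimal sketch comparing the two character classes. The helper names `is_alnum_only` and `is_word_only` are made up for illustration, and the `/a` modifier assumes Perl 5.14 or later; it restricts `\w` to ASCII so the two patterns differ only by the underscore.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $name contains only ASCII letters and digits.
# The explicit class has no underscore, so "Perl_Monks" is rejected.
sub is_alnum_only {
    my ($name) = @_;
    return $name =~ /\A[A-Za-z0-9]+\z/ ? 1 : 0;
}

# Hypothetical helper: true if $name contains only ASCII letters, digits
# and underscores. /a (Perl 5.14+) keeps \w from matching non-ASCII
# word characters.
sub is_word_only {
    my ($name) = @_;
    return $name =~ /\A\w+\z/a ? 1 : 0;
}

for my $name ('PerlMonks', 'Perl_Monks', 'Perl Monks') {
    printf "%-12s alnum=%d word=%d\n",
        $name, is_alnum_only($name), is_word_only($name);
}
```

Anchoring with `\A` and `\z` rather than `^` and `$` also rejects a trailing newline, which `$` alone would let through.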

Re: Regex usernames and Passwords
by graff (Chancellor) on Apr 17, 2005 at 04:16 UTC
    Since the only thing you'll do with the password string is to get its MD5 signature and insert or compare the MD5 to the database, there's no reason to limit the characters being used. (If someone figures out how to include a null byte in their password string, more power to them!) Just check for min and max length (maybe not even max length).

    As for user name, sticking to strings that match /^[-\w.]{4,10}$/ should suffice; that allows 4 to 10 characters that must all be alphanumeric, underscore, dash or period. (Adjust length constraints to suit your taste.)
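A rough sketch of that approach, using the core Digest::MD5 module. The function names and the minimum length of 6 (taken from the question) are assumptions for illustration, and the input is assumed to be a byte string, since `md5_hex` croaks on wide characters.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Minimum password length from the original question; adjust to taste.
use constant MIN_PASS_LEN => 6;

# Hypothetical helper: 4 to 10 characters, each alphanumeric, underscore,
# dash or period. \A and \z anchor the whole string; /a keeps \w ASCII-only.
sub valid_username {
    my ($name) = @_;
    return 0 unless defined $name;
    return $name =~ /\A[-\w.]{4,10}\z/a ? 1 : 0;
}

# Hypothetical helper: no character restrictions, only a length check,
# then the hex MD5 digest that would be stored or compared.
sub password_digest {
    my ($pass) = @_;
    return unless defined($pass) && length($pass) >= MIN_PASS_LEN;
    return md5_hex($pass);
}

my $digest = password_digest('This_is-my|pass,,word');
print "digest length: ", length($digest), "\n";
```

Note that `md5_hex` returns 32 hexadecimal characters (the raw digest from `md5` is 16 bytes), so the column holding it needs to be sized for whichever form is stored.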

Re: Regex usernames and Passwords
by polettix (Vicar) on Apr 17, 2005 at 18:23 UTC
    Don't limit the character set for passwords; if you do, you're giving more power to a potential attacker. As a matter of fact, people are usually encouraged to put some non-alphabetical, non-numerical characters in their passwords, to reduce the risk of a dictionary attack and increase the total entropy of the password, thus making it more "secure".

    Moreover, given that you're using this string only to compute an MD5 digest, it should really not be a problem for your script to allow any character, as per the previous suggestion by graff.
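The entropy argument can be put into numbers with a back-of-the-envelope calculation. This sketch assumes every character is chosen uniformly at random, which gives an upper bound of length times log2(alphabet size); passwords chosen by people will have considerably less. The function name `max_entropy_bits` is made up for illustration.

```perl
use strict;
use warnings;

# Upper bound on the entropy, in bits, of a password of $length characters
# drawn uniformly at random from an alphabet of $alphabet_size symbols.
sub max_entropy_bits {
    my ($alphabet_size, $length) = @_;
    return $length * log($alphabet_size) / log(2);
}

# 62 = A-Z, a-z, 0-9; 94 = all printable ASCII characters except space.
printf "62-symbol alphabet, 8 chars: %.1f bits\n", max_entropy_bits(62, 8);
printf "94-symbol alphabet, 8 chars: %.1f bits\n", max_entropy_bits(94, 8);
```

For an 8-character password this comes to about 47.6 bits for the alphanumeric alphabet versus about 52.4 bits for full printable ASCII, so restricting the character set shrinks the space an attacker has to search.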

    Flavio (perl -e "print(scalar(reverse('ti.xittelop@oivalf')))")

    Don't fool yourself.
Re: Regex usernames and Passwords
by gam3 (Curate) on Apr 16, 2005 at 20:42 UTC
    It looks to me like you are on the right path.
    -- gam3
    A picture is worth a thousand words, but takes 200K.
Re: Regex usernames and Passwords
by Smylers (Pilgrim) on Apr 18, 2005 at 09:01 UTC
    if ($user_pass =~ /^[A-Za-z0-9]+$/ && length($user_pass) < 17) { $user_pass = $1;

    $1 won't contain anything unless you put some capturing parens in the pattern.
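A minimal sketch of the pattern with capturing parentheses added. It keeps the alphanumeric restriction and the 6 to 16 length limits from the question purely for illustration; the helper name `untaint_password` is made up. Under taint mode, a value extracted through a capture group is what Perl treats as untainted.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the validated (and, under -T, untainted)
# password, or undef if the input does not match.
sub untaint_password {
    my ($raw) = @_;
    return unless defined $raw;

    # The parentheses capture the match into $1. The {6,16} quantifier
    # replaces the separate length() test from the question.
    if ($raw =~ /\A([A-Za-z0-9]{6,16})\z/) {
        return $1;
    }
    return;
}

my $clean = untaint_password('abc123XYZ');
print defined($clean) ? "accepted: $clean\n" : "rejected\n";
```

Returning undef on failure lets the caller decide whether to alert the user or abort, instead of silently carrying on with the tainted value.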

    Smylers

Re: Regex usernames and Passwords
by Anonymous Monk on Apr 18, 2005 at 10:00 UTC
    I'm using taint mode and would like to untaint valid usernames and passwords. Basically, I'm not sure what to allow and what not to allow. Should passwords only allow alphabetical and numerical characters, or can a password be "This_is-my|pass,,word"?
    I don't see a reason to put any limitations on passwords, except disallowing characters that make it hard to input passwords or to sling them around between functions. I'd forbid the NUL character ("\x00"), carriage return ("\cM"), and linefeed ("\cJ").
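One way to express that rule is as a deny-list of just those three characters plus a minimum length. This is a sketch only: the helper name `acceptable_password` is hypothetical, and the default minimum of 6 comes from the original question.

```perl
use strict;
use warnings;

# Hypothetical helper: accept any password of at least $min_len characters
# that contains no NUL, carriage return, or linefeed.
sub acceptable_password {
    my ($pass, $min_len) = @_;
    $min_len = 6 unless defined $min_len;
    return 0 unless defined($pass) && length($pass) >= $min_len;
    return $pass =~ /[\x00\cM\cJ]/ ? 0 : 1;
}

for my $candidate ('This_is-my|pass,,word', "line\nbreak", 'abc') {
    (my $shown = $candidate) =~ s/\n/\\n/g;    # make the newline visible
    printf "%-24s %s\n", $shown,
        acceptable_password($candidate) ? 'ok' : 'rejected';
}
```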

    As for usernames, it depends on what you are going to use them for. If they are used purely for authentication, there aren't many limits. If user names also need to be printed, you probably want to disallow unprintable characters, and maybe whitespace as well. Perhaps you want user names to be case insensitive, or perhaps you don't mind having a "carry", a "Carry" and a "CARRY" on your system. What to allow and what not to allow for user names is something you should decide, guided by what you use (or plan to use) usernames for.
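If case-insensitive user names are what you want, one common approach is to compare a lowercased key while keeping the spelling the user typed. The sketch below uses a hypothetical in-memory hash `%taken` and helper `register_username`; a real application would more likely enforce uniqueness in the database. `lc` is enough for ASCII names; for Unicode names, `fc` (Perl 5.16+) would be the safer choice.

```perl
use strict;
use warnings;

# Hypothetical in-memory registry, keyed by the lowercased name.
my %taken;

# Returns 1 if the name was free and is now registered, 0 if a name
# differing only in case already exists.
sub register_username {
    my ($name) = @_;
    my $key = lc $name;
    return 0 if exists $taken{$key};
    $taken{$key} = $name;    # keep the spelling the user chose
    return 1;
}

for my $name (qw(carry Carry CARRY larry)) {
    printf "%-6s %s\n", $name,
        register_username($name) ? 'registered' : 'already taken';
}
```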