Remove or Identify Shell Commands In A Form

rongoral has asked for the wisdom of the Perl Monks concerning the following question:

Greetings Wise Ones -

I seek a means to identify and, if desired, remove shell commands from online form content using pure Perl 5.6. I would also like to be able to differentiate between shell commands and common HTML tags. The specific form allows logged-in users to change page content, so they must be allowed to input HTML tags, but I want to prevent hackers from submitting shell commands. Currently I am using this (hacked) method, but I find that it also blocks HTML tags. The method lives in a standard library so that all my .cgi scripts, and therefore all form submissions, go through it. I do not want to muck around with hidden form values and the like, as this would remove the "generic" quality of my library. Any help would be greatly appreciated.

sub GetFormVars
    {
    #- Var Declaration And Initialization
    my ($hr_self,$hr_module) = @_;

    # Retrieve the form variables
    $hr_module->{FORM} = $hr_module->{cgi}->Vars;

    # Retrieve the user's IP
    $hr_module->{FORM}{ip} = $ENV{REMOTE_ADDR};

    # Iterate through the hash and transform the keys
    foreach my $key(keys %{$hr_module->{FORM}})
        {
        my $new_key = $key;

        # Change spaces to underscores and remove any trailing or leading spaces
        $new_key =~ s/^\s+//;
        $new_key =~ s/\s+$//;
        $new_key =~ s/\s+/_/g;

        # If there has been a change...
        if ($new_key ne $key)
            {
            # Make all keys lower case
            $new_key = lc $new_key;
            # Create the new hash element
            $hr_module->{FORM}{$new_key} = $hr_module->{FORM}{$key};
            # Remove the old key/value pair
            delete $hr_module->{FORM}{$key};
            }

        # Reset the key value
        $key = $new_key;

        # Stop people from using subshells to execute commands.
        $hr_module->{FORM}{$key} =~ s/~!/ ~!/g;
        my $value = $hr_module->{FORM}{$key};

        # Check for comments and stuff in the string
        if ($value =~ m/\<\!--\#(.*)\s+(.*)\s?=\s?(.*)--\>/ || $value =~ m/[;><\*`\|]/)
            {
            # Blank out the value
            undef ($hr_module->{FORM}{$key});
            # Set an error message and return 0
            return $hr_self->PrepareErrorMessage(10403, {'\$hr_self->{FORM}{ip}'=>$hr_module->{FORM}{ip}});
            }

        # If this is a command, modify the value by changing spaces to
        #   underscores and making it all lower case
        map {s/\s+/_/g;$hr_module->{FORM}{$key} = lc;} $hr_module->{FORM}{$key} if $key eq 'cmd';
        }

    return 1;
    }
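A minimal sketch of why the check above stops HTML tags: the character class in the second regex contains `<` and `>`, so any tag matches it, while a destructive command that happens to avoid those characters does not. The helper name `looks_dangerous` and the sample strings are made up for illustration; only the regex is copied from GetFormVars.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the same character class used in GetFormVars.
sub looks_dangerous {
    my ($value) = @_;
    return $value =~ m/[;><\*`\|]/ ? 1 : 0;
}

# Illustrative inputs only: an HTML tag, a shell command, plain prose,
# and an HTML entity (which contains a semicolon).
for my $sample ('<b>bold</b>', 'rm -rf /tmp/x', 'plain text', '&amp;') {
    my $verdict = looks_dangerous($sample) ? 'rejected' : 'accepted';
    printf "%-16s => %s\n", $sample, $verdict;
}
```

The HTML tag and the entity are rejected, while the `rm` command is accepted, which suggests that filtering on characters cannot separate HTML from shell input.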

Indeed, as kappa points out, I am probably not expressing my question well enough. I want to be able to remove possible shell commands from user input while allowing HTML tags to remain. As Zaxo stated:

"You'd need an accurate combination html/shell parser to sanitize shell constructs from html fragments."

But I do not understand:

"Why bother, if the text never gets shell-interpreted?"

I am also displaying my ignorance, as I do not understand what Zaxo's answer is all about regarding this question. I am doing nothing at the command line. This is purely being done via web forms. Since the input is either put into text form and emailed, placed in a database, or displayed in an HTML page, would the shell actually be hit? I sort of understand that shell commands can and do get executed when placed in the email header, but I do not understand whether this is also a danger with a <textarea></textarea>.

Yes, I really am displaying ignorance. I will follow rinceWind's advice and read Ovid's piece and perhaps gain some understanding of Zaxo's wisdom that way.


Replies are listed 'Best First'.
Re: Remove or Identify Shell Commands In A Form by Zaxo (Archbishop) on Nov 26, 2004 at 15:32 UTC
Your better strategy is to avoid exposure by never feeding user input to the shell. If you must give arguments to utilities, use the list form of system, magic open, or exec so that no shell interpretation of the command line is done. You'd need an accurate combination html/shell parser to sanitize shell constructs from html fragments. Why bother, if the text never gets shell-interpreted? Your users might discuss shell programming without any ill intent.

After Compline, Zaxo
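A minimal sketch of that advice, assuming a Unix-like system where fork is available. The helper name `run_without_shell` is hypothetical, and the child command here is just the running perl echoing its argument. The two-argument `open` with `'-|'` forks, and `exec` with a list hands the arguments straight to the program, so shell metacharacters in the user input are treated as plain data.

```perl
use strict;
use warnings;

# Hypothetical helper: run a command and capture its output without a shell.
sub run_without_shell {
    my @cmd = @_;

    # Forking open: returns the child's pid in the parent and 0 in the
    # child, whose STDOUT is connected to $fh in the parent.
    my $pid = open(my $fh, '-|');
    die "cannot fork: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: the block form of exec never invokes a shell.
        exec { $cmd[0] } @cmd;
        die "cannot exec $cmd[0]: $!";
    }

    # Parent: slurp everything the child printed.
    my $output = do { local $/; <$fh> };
    close $fh;
    return $output;
}

# Illustrative input containing shell metacharacters.
my $user_input = 'hello; echo INJECTED';
print run_without_shell($^X, '-e', 'print "[$ARGV[0]]"', $user_input), "\n";
```

The semicolon and the `echo` reach the child as one literal argument rather than as a second command. When the output does not need capturing, `system(@cmd)` with a list of more than one element avoids the shell in the same way.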
Re^2: Remove or Identify Shell Commands In A Form by rongoral (Beadle) on Nov 26, 2004 at 16:30 UTC
Thank you, Zaxo, for the reply. However, my main concern is not that I would execute the commands deliberately, but that I might do so inadvertently. For instance, if a field in the form collects an email address that is in turn used as a "reply to" and the form results are emailed to someone else, I do not want to open a window for unkind people who may try to insert shell commands there to hack the site. The scope of the posted method is simply to gather the data from the form, do a limited validation of the data, and send it back to the calling script in the form of a hash_ref. The form data is then used within the calling script.
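For the "reply to" case specifically, one option is to validate the field against what an address may look like, instead of hunting for shell syntax. The sketch below assumes the value will end up on a single mail header line. The helper name `header_safe_address` is hypothetical, and the shape check is deliberately loose, not a full address validator.

```perl
use strict;
use warnings;

# Hypothetical helper: return the address if it is safe to place on one
# header line, or undef otherwise.
sub header_safe_address {
    my ($addr) = @_;
    return undef unless defined $addr;

    # A CR or LF would let the sender start a new header line.
    return undef if $addr =~ /[\r\n]/;

    # Loose shape check: something@something, no whitespace, one "@".
    return undef unless $addr =~ /\A[^\s\@]+\@[^\s\@]+\z/;

    return $addr;
}

# Illustrative inputs only.
for my $candidate ('user@example.com', "user\@example.com\nBcc: other\@example.com") {
    my $ok = defined header_safe_address($candidate) ? 'accepted' : 'rejected';
    (my $shown = $candidate) =~ s/\n/\\n/g;
    print "$shown => $ok\n";
}
```

This only guards the header line. If the mail is handed to an external program, the address still should not pass through a shell command line.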
Re^3: Remove or Identify Shell Commands In A Form by kappa (Chaplain) on Nov 26, 2004 at 18:58 UTC
If you follow Zaxo's advice you will achieve exactly that -- you'll save yourself from inadvertently running shell commands from malicious users. Just do not pass user input to the shell. Or maybe neither of us understands your question.

--kap
Re: Remove or Identify Shell Commands In A Form by rinceWind (Monsignor) on Nov 26, 2004 at 16:49 UTC
Also of relevance besides Zaxo's comments is Ovid's Web programming with Perl Course, which covers the subject of calling the shell from CGI scripts and threats in detail.

-- I'm Not Just Another Perl Hacker
Re^2: Remove or Identify Shell Commands In A Form by rongoral (Beadle) on Nov 26, 2004 at 17:29 UTC
Ah. Thank you rinceWind for the link to Ovid's place. I'll read this. Very helpful indeed.