Tommy has asked for the wisdom of the Perl Monks concerning the following question:

I'm wanting to add a dictionary lookup at my website that just directly interfaces with the availabe dict shell command. I also want to do this securely. How should I go about doing this? —Not the implementation, but the sanitizing of data before it is passed through to qx// (the shell)?

Basically what I've got so far:

use CGI; use My::QXcall; $cgi = CGI->new(); my(@terms) = $cgi->param('t'); # do stuff sanitize input data, etc. foreach (@terms) { $_ =~ s/(?:^\s)|(?:\s$)//o; # trims leading/trailing whitespace # [other stuff I should be doing would go here...] } # look up terms with dict print QXcall(qq{dict ${\ scalar join(' ', map { "'" . quotemeta($_) . "'" } @terms) }});

Does that code do enough in the way of security? I'm wondering if taint should be figuring in here somewhere...or something(s) else?

...And by the way, here's the non-standard "My" module referenced in the above code if you'd like to see it to get some more context:

package My::QXcall; use strict; use vars qw( $VERSION @ISA @EXPORT ); use Exporter; @ISA = qw( Exporter My ); $VERSION = 0.00_1; # Wed Jun 11 17:14:17 CDT 2003 @EXPORT = qw( QXcall QXstatus QXsignal QXerror QXoserror QXerrli +ne ); our($STAT, $SIG, $ERR, $OSERR, $LINE); sub QXcall { undef($STAT); undef($SIG); undef($ERR); undef($OSERR); undef($LINE); my($call,$errmsg) = @_; my(@call) = `$call`; $STAT = $? >> 8; $SIG = $SIG{ $? & 127 } || $? & 127; if ($STAT) { $OSERR = $^E; $LINE = (caller)[2]||'?'; $ERR = $errmsg ? $errmsg . NL : '' . join("\n",@call) . qq[syscall failed with status $STAT, signal $SIG] . "\n"; } @call } sub QXstatus { $STAT } sub QXsignal { $SIG } sub QXerror { $ERR } sub QXoserror { $OSERR } sub QXerrline { $LINE } 1;

Thanks everyone!

--
Tommy Butler, a.k.a. Tommy

Replies are listed 'Best First'.
Re: Safely passing CGI form data to a shell command
by deibyz (Hermit) on Apr 21, 2005 at 16:33 UTC
    First, always -T! You can forget to untaint something a it can be dangerous.
    Second, if you're doing a dictionary search, I would only let /\w+/ strings, throwing out anything else.
      I was fearing this would prevent searches on hypenated terms ("-") and perhaps some other possibilities I hadn't thought of...
      --
      Tommy Butler, a.k.a. Tommy
      
        Ok, I should have been more specefic...

        I was trying to say that, instead of removing whitespace and quoting special chars, I would define what I want to match an then throw out the rest. I think that if you're searching in a dictionary for an English word, you'll only need [A-Za-z-] (not exactly \w+) if I'm not missing something. It's always safer to get less than expected than get something unexpected, and for sure you (and me) are going to miss something.

Re: Safely passing CGI form data to a shell command
by Tanktalus (Canon) on Apr 21, 2005 at 18:15 UTC

    First off, qx is not going to be easily secure. Use system LIST, or, in your case, IPC::Open2 with a list. (You can close the writer handle immediately since you don't need it.) This helps get rid of the shell, which is a huge annoyance except when it's a huge help. This isn't one of those help times.

    Second, as has been mentioned, you want to detaint whatever is passed in. You want to use /([[:alpha:]-]+)/ or something like that. [:alpha:] is a character class (thus must also be inside []'s) which honours locale information. Which may mean setting your locale (say to a UTF8 locale) and decoding the input to UTF8 as well (before detainting, of course) - I'm not too sure here because I've not had a reason to cross codepages before (that is, what is passed in may be a different codepage than what I'm running in).

      May I say, sweeeeeeeeeeet! Thanks All! I am confident the solutions suggested will, in summation, provide what I have sought. I have resolved upon using system() and -T + regexp.

      Again I thank you!

      --
      Tommy Butler, a.k.a. Tommy
      
        be carefull also with '-' and '+' chars. If you allow them, user could turn arguments in command options, i.e. '-foo'.

        Most Unix programs allow the double dash ('--') to be used to stop option parsing, so instead of ...

        system 'foo', $arg;
        ... it's better to use ...
        system 'foo', '--', $arg;

        :(

        I can't get around this error and stfw and rtfm is yielding no help.

        Can someone help me decipher the error message I'm getting? It's apparently fatal.

        Oh, and by the way. I've realized now that system() and exec() are useless to me as neither returns the output of my command. Duhhhhhh.

        from http://perldoc.perl.org/functions/system.html
        The return value is the exit status of the program as returned by the wait call. To get the actual exit value shift right by eight (see below). See also "exec". This is not what you want to use to capture the output from a command, for that you should use merely backticks or qx// , as described in "`STRING`" at perlop. Return value of -1 indicates a failure to start the program (inspect $! for the reason).

        Code:

        my(@call) = system($call);

        Error:

        Insecure dependency in system while running with -T switch

        --
        Tommy Butler, a.k.a. Tommy
        
Re: Safely passing CGI form data to a shell command
by tlm (Prior) on Apr 21, 2005 at 16:38 UTC

    I would require that those terms passed the most restrictive regexp (maybe /^[a-z']+$/i) that would capture the arguments to dict. I.e., adding your whitespace stripping:

    my @untainted = grep $_, map { s/^\s*([a-z']+)\s*$/$1/i; $_ } @terms;
    (NB: I have not confirmed that /^[a-z]+$/i indeed matches all valid arguments to dict.)

    the lowliest monk

      But then I wonder what happens to searches like "résumé" ?? Should I support dictionary lookups on words in other languages? Spanish for example has words like "tú".

      And that presents another error altogether. If I ask dict to look up such a word for me, it complains... "error: The request is not a valid UTF-8 string". But that error isn't a Perl-related I don't think. Or is it? Do I need to escape unicode in a special way before dict-ing for words that contain it?

      --
      Tommy Butler, a.k.a. Tommy
      
Re: Safely passing CGI form data to a shell command
by salva (Canon) on Apr 21, 2005 at 16:50 UTC
    in order to run the external command, it's safer to fork a new process via open '-|' and then execute the external program as exec @list
    # untested my @out; my $pid=open my $pipe, '-|'; defined $pid or croak "fork failed"; if ($pid==0) { exec $command, $opt1, $opt2, ..., '--', $arg1, $arg2, ... exit(0); } else { @out=<$pipe> }
    exec @list doesn't use the shell to parse and run the command so you don't need to bother about a malicious user passing things like "hello; rm -rf /"...

    well, you have to be sure that the called program doesn't pass its arguments to a shell either!!!

      I'm curious what creating another process accomplishes? Why not just call exec() from within your own process? Could you explain? Thanks!

      --
      Tommy Butler, a.k.a. Tommy
      
        fork and exec is just what qx// does (w/o the shell thing).

        Calling exec with out forking would not let you capture and postprocess the command output... actually the command output would be sent to the remote browser!

Re: Safely passing CGI form data to a shell command
by tlm (Prior) on Apr 21, 2005 at 18:02 UTC

    You will find this perlfaq item of interest: How can I call backticks without shell processing?

    Update:I should add that if you bypass the shell with exec (as salva and the FAQ above suggest), then there is less concern about malicious arguments such as ";rm -rf * *.*", which is what my other reply was addressing, but I don't know enough about dict to be able to categorically say any argument passed to it this way would be safe. FWIW, my guess is that it would be, since I just can't imagine a DoS attack, for example, based on some special argument to dict.

    the lowliest monk