Your code doesn't handle URLs that use ; as the parameter separator, and it doesn't properly handle multiple values for one parameter name.

It also presents a huge backdoor, as it will allow any attacker to overwrite any global scalar variable in your script by sending a carefully crafted query against it. You don't show the tokenise subroutine, but I see that using a parameter name of {main::foo} will set/overwrite the global variable $main::foo.

Furthermore, any query containing the string *amp* (for example, in a search) will mutilate the whole query string. That is poor practice and must at least be documented. The same goes for *plus*, and there is no reason for that one at all.

There is a reason why people use CGI.pm or its lighter cousin, CGI::Lite, as it presents a safe and relatively foolproof way of decoding script parameters.

perl -MHTTP::Daemon -MHTTP::Response -MLWP::Simple -e ' ; # The
$d = new HTTP::Daemon and fork and getprint $d->url and exit;#spider
($c = $d->accept())->get_request(); $c->send_response( new #in the
HTTP::Response(200,$_,$_,qq(Just another Perl hacker\n))); ' # web

Re: the search string and me
by jonadab (Parson) on Sep 12, 2003 at 14:12 UTC
    doesn't handle URLs that use ; as the parameter separator, and it doesn't properly handle multiple values for one parameter name

    Neither of those things is needful, generally, and since this is his own function, which he is using in his own scripts, he has complete control over whether the scripts use those esoteric features.

    It also presents a huge backdoor

    Yes, absolutely. Rather than assigning directly to global variables, he should be storing the input in a hash.

    Furthermore, any query with the string *amp* in it

    Indeed. See my obfuscated version above, which handles this correctly. (Its larger, unobfuscated prototype, the function I normally use, also handles some things that I stripped out for brevity, such as file uploads, but those things are not needed for most CGI scripts.)

    There is a reason why people use CGI.pm or its lighter cousin, CGI::Lite

    Yes, but as came up in a recent unrelated thread, there are also good reasons, especially for scripts that may be deployed in various locations under various circumstances, to avoid using any non-core modules (or, in fact, anything that hasn't been in core at least since 5.003). Also there are good reasons for generating all the HTML yourself, as it allows you to guarantee certain things about its structure. It is of course certainly possible to use a module for fetching the input and still generate the output yourself, however.


    $;=sub{$/};@;=map{my($a,$b)=($_,$;);$;=sub{$a.$b->()}} split//,".rekcah lreP rehtona tsuJ";$\=$ ;->();print$/
      Can you give me a link to the "obfuscated version above" that you mentioned in relation to parsing & and the use of "*amp*"?

      Thanks for the comments,

      ___ /\__\ "What is the world coming to?" \/__/ www.wolispace.com
        can you give me a link to your "See my obfuscated version above.."

        Yes, here.

        That's been deliberately obfuscated, but in a nutshell I just take the raw input from the CGI query, split on ampersands, do a foreach loop over the result, split on equal signs, feed the resulting list through a map that unmangles the rest of the characters (including any encoded ampersands and equal signs, incidentally), and the result will be a key/value pair that can be stored in the input hash. Splitting first, *then* unmangling the result, has the benefit of removing from consideration any complications that might otherwise result from getting mixed up about what is and what is not an unmangled ampersand or equal sign. It's also simple and straightforward (when it hasn't been deliberately obfuscated).

        This only handles key/value pairs delimited by ampersands, with the equal sign separating the key from the value, but all GET queries and most normal POST input will come to you that way, even if you use a big fat textarea. (I believe my non-obfuscated version also handles the case of an equal sign in the value, though of course there can't be one in the key unless it's encoded, by using the extra optional arg to split; I golfed that out of the obfuscated version.)
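
        The split-first, unmangle-second approach described above might look roughly like the sketch below. This is not the obfuscated code referred to in this thread; the names parse_query and unmangle are made up for illustration, and it assumes ordinary URL-encoded input where + stands for a space and %XX is a hex escape.

```perl
use strict;
use warnings;

# Hypothetical helper (not the poster's actual code): turn '+' back into
# a space and decode %XX escapes.
sub unmangle {
    my ($text) = @_;
    $text =~ tr/+/ /;
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

# Hypothetical parser: split on '&' first, then on '=', and only then
# decode each piece, so an encoded '&' or '=' can never be mistaken
# for a delimiter.
sub parse_query {
    my ($raw) = @_;
    my %input;
    foreach my $pair (split /&/, $raw) {
        # The limit of 2 keeps any literal '=' inside the value intact.
        my ($key, $value) = map { unmangle($_) } split /=/, $pair, 2;
        next unless defined $key && length $key;
        $input{$key} = defined $value ? $value : '';
    }
    return \%input;
}

my $input = parse_query('q=fish%26chips&note=a%3Db&name=Perl+hacker&eq=x=y');
print "$_ => $input->{$_}\n" for sort keys %$input;
```

        Note that this sketch keeps only the last value when a parameter name is repeated, and it does not accept ; as a separator.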

        The only circumstance I've encountered thus far where you get something different (apart from cookies, which come in their own environment variable) is file uploads; in that case you have to parse a custom boundary marker and deal with multiple parts and other stuff, and it does get rather more complex. If you need to do that, I suggest getting a module that does it for you, unless your purpose in rolling your own is to learn how it works.

        I rolled my own to figure out how it works, but I only ever used it to test that I had it working; since then, I've only used the regular kind of input, not having a need for file uploads. If I need to get files to the server, I ssh in and wget them from the Apache on my workstation, which has the added benefit of being resumable if my dialup connection dies partway through. If at some point in the future I discover a need to parse file uploads and store the file, I'll probably look for a module that does that; my own routine for that is too complex for me to be sure it doesn't have lurking bugs. However, for regular CGI input, I am fairly confident in my implementation; it's very simple and very straightforward. In the form I normally use it, it's a sub in an include file that I require, and it returns a reference to a hash containing all the input (with all of the keys and values marked as tainted, in case I should forget myself and try to do anything insecure with them). The calling script usually dereferences and assigns the result to a global hash called %input.


Re: Re: Re: the search string and me
by wolis (Scribe) on Sep 15, 2003 at 03:57 UTC
    Thank you for your comments.

    If a variable were overwritten by a crazy, deranged or plain curious person, could that have a more serious effect than stopping the code from working?

    If I never try to execute a scalar variable as a command, the worst it can do is perhaps display incorrect values, yes?

    Or could some clever trick be performed at my initial eval statement?

    As for the *amp* stuff, that was for resolving a particular issue I had at one stage, and it has remained because it has not caused a problem since :-)

    ___ /\__\ "What is the world coming to?" \/__/ www.wolispace.com

      I think you take the wrong approach to interaction on the internet: your users should never be regarded as crazy, deranged or plain curious, but as malicious. And yes, I do think that overwriting a global value can have severe effects. But you use eval to set that string. So if I crafted a query parameter named [system q(rm -rf /)], that code would be executed by your eval statement.

      You could do some dereferencing via a hash to fill the variable with the parameter to get around the eval statement, but let's face it - CGI.pm and its cousins already do that and in a tried and tested way.
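
      As a rough illustration of going through a hash instead of eval, here is a minimal sketch. The fill_template function and the [name] tag syntax are invented for this example (the real template code is not shown in this thread); the point is only that a parameter name is used as a hash key and is never executed.

```perl
use strict;
use warnings;

# Hypothetical template filler: replaces tags such as [username] with the
# matching entry from a hash of parameters. There is no eval and no
# symbolic reference, so a parameter name is never run as code.
sub fill_template {
    my ($template, $params) = @_;
    $template =~ s{\[(\w+)\]}{
        exists $params->{$1} ? $params->{$1} : ''
    }ge;
    return $template;
}

my %params = (
    username             => 'wolis',
    'system q(rm -rf /)' => 'ignored',   # a hostile name is just an odd hash key
);

print fill_template("Hello, [username]! Unknown: [nosuch]\n", \%params);
```

      Tags with no matching parameter are replaced by an empty string here; whether that is the right behaviour depends on the script.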

        Ah.. <penny drops> now I'm beginning to understand.

        Being a Windows user, I am slowly understanding all of this stuff and the relationship with special variables and the underlying system.

        I only added the eval bit because I got tired of having to manually write $username = $params{'username'}; each time I wanted to display the username variable. (That display code is again my own: it replaces tagged fields with variables, and I was unable to get it to handle hash or list values, but it was happy with scalars.)

        So 'taint' sounds interesting.. must do some reading about that :-)

        ___ /\__\ "What is the world coming to?" \/__/ www.wolispace.com
      If a variable were overwritten by a crazy, deranged or plain curious person, could that have a more serious effect than stopping the code from working?

      Potentially, if crafted in malice, depending on what your code does with global variables. (Bear in mind also that the special variables are vulnerable under your implementation.) You could think through each and every global variable and each special variable to determine whether anything your script does could have bad effects if one of these variables holds a malicious value, or you could store the input in a hash and save yourself that effort. Running under taint mode would also help to curb this threat or at least make it much harder for anyone to exploit.
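
      To make the difference concrete, here is a small sketch. The store_in_globals routine is only a guess at the general shape of the problem, not the code under discussion, and $admin is a made-up global.

```perl
use strict;
use warnings;

our $admin = 0;    # hypothetical global that the script relies on

# Unsafe pattern (an assumption about the general shape, not the original
# code): the parameter name decides which package variable is assigned.
sub store_in_globals {
    my ($name, $value) = @_;
    no strict 'refs';
    ${"main::$name"} = $value;
}

# Safer pattern: the parameter name is only ever used as a hash key.
my %input;
sub store_in_hash {
    my ($name, $value) = @_;
    $input{$name} = $value;
}

store_in_globals('admin', 1);
print "after globals: admin=$admin\n";    # the global has been overwritten

$admin = 0;
store_in_hash('admin', 1);
print "after hash: admin=$admin\n";       # the global is untouched
```

      With the hash, a hostile parameter name can at worst create an unexpected key, which the script can simply ignore.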

