http://qs1969.pair.com?node_id=11134028

Bod has asked for the wisdom of the Perl Monks concerning the following question:

I'm taking my first tentative steps into using taint mode!
A problem has quickly got me rather stuck...

I want to use a module in a relative path but taint mode removes '.' from @INC
If I do this:

#!/usr/bin/perl -T

use CGI::Carp qw(fatalsToBrowser);

use Site::HTML;

use strict;
use warnings;
It complains that Site::HTML cannot be found.
So I called on FindBin to help like this:
#!/usr/bin/perl -T

use CGI::Carp qw(fatalsToBrowser);

use FindBin qw($Bin);
use lib "$Bin";
use Site::HTML;

use strict;
use warnings;
But now I get the error: Insecure dependency in require while running with -T switch at migrate.pl line 7.

Is this the correct way to load a relative module under taint mode?
Is the problem with the way I am loading the module or with the module itself?
Should I be doing something completely differently?

Re: Using relative paths with taint mode
by haj (Vicar) on Jun 19, 2021 at 19:13 UTC
    but taint mode removes '.' from @INC

    As of Perl 5.26, the current directory isn't in @INC any more, regardless of taint mode.

    But now get an error Insecure dependency in require while running with -T switch at migrate.pl line 7.

    It is safe to assume that paths returned by FindBin are tainted, and there's a reason behind that. For example, if someone creates a symlink to your script in some other directory and starts via the symlink, then FindBin will report the directory of the symlink! So, the script is loading relative to the symlink and not relative to your script - and this could be a malicious module. Protecting against this type of attack is an explicit purpose of taint mode.

    The important question is: Against which sort of threats do you want to defend by using taint mode? If you load from a relative path, then someone might load and execute malicious code. It is the responsibility of your script to decide whether it wants to enable that by untainting the library before using it.

      The important question is: Against which sort of threats do you want to defend by using taint mode? If you load from a relative path, then someone might load and execute malicious code. It is the responsibility of your script to decide whether it wants to enable that by untainting the library before using it.

      In this situation, for someone to replace the module with malicious code would mean they have access to the directory structure of the website and the cgi-bin. If someone wants to do some harm with that level of access, they could do it much more easily than by interfering with a module. My best guess is that the only people who could create a symlink are the server admins who, again, could do damage in other ways if they were minded to.

      So, I am thinking that untainting $Bin isn't much of a practical security risk in this instance.

      Is that sensible or am I being overly optimistic?

        This is exactly the consideration I wanted you to make. Perl doesn't know that your script is supposed to be called via the web, but you do. Your reasoning about server admins is OK: they would not need to exploit your use of $Bin to do harm.

        I'd change something which isn't related to taint mode, though: In your setup with use lib $Bin;, you have your libraries within the cgi-bin path. This is unhygienic since your libraries are now exposed to attacks from the web. At least you need to consider what happens if someone points his browser to http://your.stuff/cgi-bin/Site/HTML.pm.

        In a typical CPAN-like setup you have two different directories for scripts and libraries, so you'd usually end up with use lib "$RealBin/../lib";. This would allow you to install that stuff "somewhere" and then symlink to the script (and only to the script) from your cgi-bin directory. That way, only the script's URL is exposed, and $RealBin will resolve the symlink and find the installation directory with the libraries for you. The web server might need a directive to allow symlinks for this to work.

        So, I am thinking that untainting $Bin isn't much of a practical security risk in this instance.

        Is that sensible or am I being overly optimistic?

        Well, you could try it to see it. (DON'T!) Once your server has been taken over, you know you were too optimistic.

        Unfortunately, this is a little bit similar to the halting problem. Until your server has been taken over, you can never be sure that it won't be taken over.

        You could "blindly" untaint $FindBin::Bin, accepting any value and hoping for the best, without ever being sure. You could validate and thereby implicitly untaint $FindBin::Bin. Or you could use a hardcoded absolute path.

        What is your intention? Are you writing for a limited set of machines, maybe a single machine, with a known configuration? Or do you want unlimited distribution to machines with unknown configurations?

        In the first case, I would hardcode the absolute path. In the second case, I would distribute all modules found in the private lib directory via CPAN, or at least in the form of a CPAN-compatible archive, and have them installed like any other modules in one of the regular directories listed in @INC, no use lib needed.

        Alexander

        --
        Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
      As of Perl 5.26, the current directory isn't in @INC any more, regardless of taint mode

      So, I am getting the point that taint, and later versions of Perl are trying to make it difficult to use relative paths for modules!

      This raises a different question...
      Where should the modules be located?

      If there was just one 'version' or environment of the website then it would be easy to just put them somewhere above the website root in the filesystem. I do this with modules that are common to all my websites. But there are always 2 and sometimes 3 different environments for every website. Production, test and sometimes development.

      /home/myusername/somewebsite/prod/cgi-bin/Site/
      /home/myusername/somewebsite/test/cgi-bin/Site/
      /home/myusername/somewebsite/dev/cgi-bin/Site/

      Having the modules in a relative path allows them to be logically separated in each of the environments and to be developed and tested before being released into production.

      Is the solution to locate the modules above the website root with a different subfolder for each environment?

      /home/myusername/somewebsite/prod/cgi-bin/
      /home/myusername/somewebsite/test/cgi-bin/
      /home/myusername/somewebsite/dev/cgi-bin/
      /home/myusername/somewebsite/perlmodules/prod/
      /home/myusername/somewebsite/perlmodules/test/
      /home/myusername/somewebsite/perlmodules/dev/

      I tried this for template files on the first site I created using Template. It was a bit messy to maintain so instead, for future projects, I have stored the templates under the webroot but protected from HTTP access by putting an index.html file in the directory which sends the user to the homepage.

      Is there a better way to handle this issue - what is the accepted norm for locating Perl modules on a webserver where there are multiple sites and multiple environments within each site?

        So, I am getting the point that taint, and later versions of Perl are trying to make it difficult to use relative paths for modules!

        My reading of it is that they are trying to make it more difficult to use relative paths for modules accidentally. It is still trivially easy to use taint mode with relative paths on purpose. I do this frequently.

        Is the solution to locate the modules above the website root with a different subfolder for each environment?

        My solution is not to run dev, test and prod on the same machine. If you don't have multiple machines (why ever not?) then use multiple paths like this:

        /var/www/devsite/lib
        /var/www/testsite/lib
        /var/www/prodsite/lib

        This keeps the relative path the same across all three sites. Then all you need to do to use your modules is either:

        use lib '../lib'; # explicit, hard-coded relative path.

        or to save counting multiple ../../../ -

        use lib "$ENV{DOCUMENT_ROOT}/../lib" =~ m#^(/var/www/[a-z]+site/html/\.\./lib)$#; # path relative to docroot; the capture is untainted

        and you're done. Simple, effective, secure.


        🦛

        So, I am getting the point that taint, and later versions of Perl are trying to make it difficult to use relative paths for modules!

        You may put it like that. It turned out that too many people got it wrong and ended up with security holes, so making it difficult (but not impossible) gives people a chance to ponder other approaches.

        If a website has more than one environment, then you need a plan anyway (again, nothing to do with taint mode) how you deploy and maintain the files in your different environments. There are many solutions for that, but I'd go for something like this:

        /home/myusername/somewebsite/prod/cgi-bin /home/myusername/somewebsite/prod/lib /home/myusername/somewebsite/prod/templates
        with the same subdirectories for dev and test. So each environment has its own base directory, but below that they all have the same structure. Then it is indeed possible to use FindBin to detect which environment you're actually in (assuming you don't run a persistent interpreter like mod_perl).

        my ($prefix, $website, $environment, $basedir);
        BEGIN {
            $prefix  = '/home/myusername';
            $website = 'somewebsite';
            use FindBin qw($RealBin);
            if ($RealBin =~ m!^$prefix/$website/(dev|test|prod)/cgi-bin!) {
                $environment = $1; # This is now untainted!
                $basedir = "$prefix/$website/$environment";
            }
            else {
                # $1 is not set when the match fails, so report the path itself
                die "Bad or no environment in '$RealBin'";
            }
        }
        use lib "$basedir/lib";
        use Template;
        my $tt = Template->new({INCLUDE_PATH => "$basedir/templates"});
        ...;

        The BEGIN block is needed to do the necessary calculations during the compilation so that the directory is available when use lib is processed.

        Other alternatives include setting the environment as an environment (sic!) variable in the corresponding section of the web server config. Environment variables are tainted, so again you need to validate/untaint them.
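The environment-variable alternative could look roughly like the sketch below. The variable name SITE_ENVIRONMENT is an assumption made up for illustration (it would have to be set in each site's section of the web server config, for example with Apache's SetEnv), and the base directory simply reuses the layout from above. The point is only that the value is checked against the three known environments before it is used.

```perl
use strict;
use warnings;

# SITE_ENVIRONMENT is a hypothetical variable name, assumed to be set
# per site section in the web server configuration.
sub environment_from_env {
    my $raw = $ENV{SITE_ENVIRONMENT};

    # Positive rule: only the three known environments are accepted.
    # The captured value is what Perl treats as untainted.
    if (defined $raw && $raw =~ /\A(dev|test|prod)\z/) {
        my $clean = $1;
        return $clean;
    }
    return;    # unset or unexpected value
}

my $environment = environment_from_env();
if (defined $environment) {
    my $basedir = "/home/myusername/somewebsite/$environment";
    print "Library directory would be $basedir/lib\n";
}
else {
    print "SITE_ENVIRONMENT is unset or invalid; refusing to guess\n";
}
```

To feed the result into use lib, the call would have to happen inside a BEGIN block, for the same reason as in the FindBin version.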

        "later versions of Perl are trying to make it difficult to use relative paths for modules!"

        They made perl safer, not more difficult to use. local::lib has been mentioned to you a couple of times previously.

Re: Using relative paths with taint mode
by hippo (Bishop) on Jun 19, 2021 at 18:12 UTC
    Is this the correct way to load a relative module under taint mode?

    It is a way but TIMTOWTDI, as usual. Where you are coming unstuck (AFAICT) is that you are not untainting $Bin between lines 5 and 6.
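As a minimal sketch of what untainting $Bin before use lib might look like: the pattern below is an assumption (an absolute Unix path made of word characters, dots, hyphens and slashes) and is close to blind untainting, so it should be tightened to the real directory layout. The Site::HTML line is commented out so that the sketch runs on its own.

```perl
# Sketch only: intended to run under taint mode (perl -T).
use strict;
use warnings;
use FindBin qw($Bin);

my $libdir;
BEGIN {
    # Assumed rule: an absolute Unix path of word characters, dots,
    # hyphens and slashes. Tighten this to your real layout.
    if ($Bin =~ m{\A(/[\w./-]+)\z}) {
        $libdir = $1;    # the capture is untainted
    }
    else {
        die "Unexpected script directory '$Bin'";
    }
}
use lib $libdir;

# use Site::HTML;    # the module from the question would load here
print "Added to the module search path: $libdir\n";
```

The BEGIN block matters because use lib is processed at compile time, so the validated directory has to exist by then.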


    🦛

      Thanks hippo
      I thought $Bin (or any variable) could only be untainted through a regular expression. Is it something in lib that will be untainting it?

      edit: Sorry - I misread hippo's comment...my befuddled brain skipped over the word not.

        I thought $Bin (or any variable) could only be untainted through a regular expression.

        Correct. (Well, not quite. But I don't see how the other way documented in perlsec - see also Re: When not to use taint mode - could do anything to improve security.)

        Is it something in lib that will be untainting it?

        No, and lib would be the wrong place for automatic untainting. How should lib know which paths are secure and which ones aren't? How should lib know which string is a valid path, and which is not? At least lib would have to accept a regular expression to validate and untaint paths. (This is what File::Find does for the untaint and untaint_pattern options.)

        Note the wording in the previous paragraph: a regular expression to validate and untaint. You don't just want to blindly untaint. You want to validate the input. Untainting of the input is just a welcome side effect of the validation.

        By the way: you generally want a positive rule, describing what valid input looks like. You don't want negative rules that forbid invalid inputs, simply because it is too easy to forget some invalid input.
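A positive rule can be sketched as a small helper that only accepts paths matching a known-good pattern. The helper name validate_dir and the /var/www/...site/lib layout are assumptions for illustration, not part of any module. A negative rule such as "reject anything containing .." would have let /tmp/Site through.

```perl
use strict;
use warnings;

# Hypothetical helper: validate a directory against a positive rule and
# return the captured (and therefore untainted) value, or nothing on failure.
sub validate_dir {
    my ($dir, $rule) = @_;
    return unless defined $dir;

    # The rule is expected to have one capture group around the whole path.
    if ($dir =~ $rule) {
        my $clean = $1;
        return $clean;
    }
    return;
}

# Positive rule: only these three library directories are acceptable.
my $site_lib = qr{\A(/var/www/(?:dev|test|prod)site/lib)\z};

for my $candidate ('/var/www/devsite/lib',
                   '/var/www/devsite/lib/../../evil',
                   '/tmp/Site') {
    my $ok = validate_dir($candidate, $site_lib);
    printf "%-35s %s\n", $candidate, defined $ok ? 'accepted' : 'rejected';
}
```

Only the first candidate matches the rule; the other two are rejected without the rule having to anticipate them.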

        Alexander

Re: Using relative paths with taint mode
by ikegami (Patriarch) on Jun 20, 2021 at 06:26 UTC

    Imagine if I ran the following commands:

    cd /tmp
    mkdir Site
    printf '%s\n' 'print "0wn3d\n";' >Site/HTML.pm
    ln -s /path/to/script.cgi script.cgi
    ./script.cgi

    This is exactly what -T is supposed to prevent.

    Update: Original exploit didn't actually work.

    Seeking work! You can reach me at ikegami@adaelis.com