SavannahLion has asked for the wisdom of the Perl Monks concerning the following question:

I'm working on untainting some directory path input and I would like a bit of advice on cleaning up directory paths.

I derived the following code from untainting user login information. I added in comments to show what I think is going on. Please correct me if I'm way off base here. $de is just some random variable name I popped in for testing. :p

My goal is to allow people to drill down in a specific directory tree such as Current_Directory/X/Y/Z but to refuse to allow them to use relative paths to leave the directory tree such as Current_Directory/../../Important_File_Name. So I came up with the following.

    $de =~ s([^a-zA-Z0-9 _/\.-])()g;  # Strip everything that's not approved.
    $de =~ tr(./)(./)s;               # Look for repeats of the . and / and squash them.
    $de =~ m/(\w+.*)$/;               # Strip off leading . and / just in case.
    $de = $1;

Is there anything important I'm missing here?
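
One thing worth guarding in the snippet above: `$1` is only meaningful when the match succeeds, so assigning it unconditionally can leave a stale or undefined value in `$de`. A minimal sketch of the same three steps wrapped in a sub that returns undef on failure (the name `clean_path` is hypothetical, not part of the original code):

```perl
use strict;
use warnings;

# Hypothetical helper: the same three steps as above, but the capture
# is only used when the final match actually succeeds.
sub clean_path {
    my ($de) = @_;
    return unless defined $de;

    $de =~ s([^a-zA-Z0-9 _/\.-])()g;   # strip everything not approved
    $de =~ tr(./)(./)s;                # squash runs of '.' and '/'

    # Fail explicitly instead of reusing whatever $1 held before.
    return $de =~ m/(\w+.*)$/ ? $1 : undef;
}

print clean_path('../../Important_File_Name'), "\n";        # Important_File_Name
print defined clean_path('../..') ? "ok\n" : "rejected\n";  # rejected
```

Note that this only changes how failure is reported; whether the filter itself is strict enough is a separate question.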

Is it fair to stick a link to my site here?

Thanks for your patience.

Re: Cleaning up directory paths.
by BrowserUk (Patriarch) on Nov 06, 2003 at 08:40 UTC

    Doing a similar thing, I was disappointed to find that none of the File::* modules would rationalise a path containing relative elements, and came up with this.

        #! perl -slw
        use strict;

        sub sanitizeDir {
            require Cwd;
            my( $path ) = @_;
            my( $cwd ) = Cwd::getcwd() =~ m[^(.*)$];
            return unless do{
                local %ENV;
                $path =~ s[^(.*)$][$1];
                chdir $path;
            };
            my $sanitized = Cwd::getcwd();
            chdir $cwd;
            return $sanitized =~ s[^$cwd][$cwd] ? $sanitized : ();
        }

        my $dubiousPath = $ARGV[0];
        my $absPath = sanitizeDir( $dubiousPath );

        if( defined $absPath ) {
            print 'Absolute path: ', $absPath;
        }
        else {
            print 'Invalid path: ', $dubiousPath;
        }

    Some tests

        P:\test>perl -T junk.pl8 .
        Absolute path: P:/test

        P:\test>perl -T junk.pl8 ..
        Invalid path: ..

        P:\test>perl -T junk.pl8 ./..
        Invalid path: ./..

        P:\test>perl -T junk.pl8 ./../.
        Invalid path: ./../.

        P:\test>perl -T junk.pl8 ./used
        Absolute path: P:/test/used

        P:\test>perl -T junk.pl8 ./used/..
        Absolute path: P:/test

        P:\test>perl -T junk.pl8 ./used/../..
        Invalid path: ./used/../..

        P:\test>perl -T junk.pl8 ./used/././t/
        Absolute path: P:/test/used/t

        P:\test>perl -T junk.pl8 ./used/././t/../
        Absolute path: P:/test/used

        P:\test>perl -T junk.pl8 ./used/././t/../..
        Absolute path: P:/test

        P:\test>perl -T junk.pl8 ./used/././t/../.././..
        Invalid path: ./used/././t/../.././..

    The basic idea is to use the OS to rationalise the path and convert it to an absolute path. You can then check that it starts with the root of the subtree you want to expose. In the example, it verifies that the path specified is in the subtree below the current working directory, but you could pass in the required cwd as a second argument to the sub.

    It returns a fully rationalised and untainted, absolute path, or undef.

    This hasn't been tested on a vulnerable system, so you will need to verify it for yourself, but hopefully there are enough experienced eyes here to spot any weakness in the approach or implementation.

    I believe it should be portable, but it has only been tested under Win32.
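
A minimal sketch of the "second argument" variant mentioned above, assuming `Cwd::abs_path` is used to resolve the path instead of `chdir`. The name `path_under_root` is hypothetical. It also quotes the root with `\Q...\E` and requires a separator after it, since a bare prefix test would treat `/srv/database` as being inside `/srv/data`. Whether the returned value counts as untainted under `-T` depends on how the root itself was obtained, so verify that separately.

```perl
use strict;
use warnings;
use Cwd ();

# Hypothetical helper: resolve $path relative to $root and return the
# absolute directory only if it is $root itself or lies beneath it.
sub path_under_root {
    my ( $path, $root ) = @_;
    return unless defined $path && defined $root;

    my $abs_root = Cwd::abs_path($root);
    return unless defined $abs_root;
    $abs_root =~ s{/\z}{};    # so a root of "/" still matches below

    # Let the OS resolve '.', '..' and symlinks. abs_path may return
    # undef or die for paths that cannot be resolved, hence the eval.
    my $abs = eval { Cwd::abs_path("$root/$path") };
    return unless defined $abs && -d $abs;

    # \Q...\E keeps metacharacters in the root literal; the (?:/.*)?
    # part forces a path separator right after the root.
    return $abs =~ m{\A(\Q$abs_root\E(?:/.*)?)\z}s ? $1 : undef;
}

# Usage: expose only the tree below the current directory.
my $safe = path_under_root( $ARGV[0] // '.', '.' );
print defined $safe ? "Absolute path: $safe\n" : "Invalid path\n";
```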


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail

Re: Cleaning up directory paths.
by graff (Chancellor) on Nov 06, 2003 at 05:16 UTC
    Maybe this would be off the mark relative to what you really want to do, but... Rather than trying to filter user input to assure acceptable directory paths, you could specify in advance what the acceptable paths are (e.g. using the output of "find your_web_root -type d" on the command line), and offer those as a list to choose from. If the list were in a side frame, then the navigation ought to be tolerably effective.

    While I haven't tried it myself, I'm sure there would be fairly simple means available to emulate the sort of index that shows "top-level" items, with little "+" icons next to them when they contain sub-levels, and clicking on the icon expands the choices on the next level down, while clicking the name takes you to the specified item (i.e. lists the contents of that directory).

      I suppose that would work in a fashion. But even if I were to present a fixed list of paths, I would still end up needing to clean up and untaint that incoming variable anyway. The only way I can think of, off the top of my head, to avoid cleaning the path is to assign each item in the list a number and compare that to an internal list, to ensure that the path information has not been altered. But I feel that that is rather unwieldy. :-\

      Nice idea though.


        But even if I were to present a list with a fixed list of paths, I would still end up needing to clean up and untaint that incoming variable anyways.

        Right. Good point. (Sorry I didn't think of that at first... did I mention that web programming is something I do relatively seldom in my job?)

        So, if you have the list that you present on people's browsers, and you get back a parameter string, rather than trying to untaint the parameter string, you just need to check whether it's an exact match to a particular string in your list of allowable paths. And this would be easy if you just store the allowed path list as keys of a hash.

        Once you establish that it does match, you don't really need the parameter string after that (no need to untaint it) -- just use the matched item from your list (which the script reads directly from the server). And if there was no match, you just send whatever alternative feedback you deem appropriate...
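
The exact-match lookup described above could be sketched as follows. The directory names and the `lookup_path` helper are hypothetical, and the list is hard-coded here only to keep the example self-contained; in practice it would be built on the server.

```perl
use strict;
use warnings;

# Hypothetical allow-list; in practice it would be generated on the
# server (e.g. from the output of find), never from user input.
my %allowed = map { $_ => $_ } qw(
    docs
    docs/manual
    docs/manual/images
);

# Return the server's own copy of the path on an exact match, or
# undef otherwise. The caller's string is only ever used as a key.
sub lookup_path {
    my ($param) = @_;
    return unless defined $param;
    return $allowed{$param};
}

my $dir = lookup_path('docs/manual');
print defined $dir ? "OK: $dir\n" : "No such directory\n";
print defined lookup_path('docs/../..') ? "OK\n" : "No such directory\n";
```

Because the value handed back comes from the hash rather than from the request, nothing from the parameter string is used after the lookup.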