Orthogonal code. You've probably heard of it. You may understand it. If "orthogonal" describes your code, you're a happy programmer -- and so are the programmers who have to maintain your code.

Orthogonal, in this context, means that a change in one place in your code will not affect anything else in your code. Using a car as an analogy, if you step on the brakes, you don't want your car to pull sharply to the right. If you turn on your windshield wipers, your high beams shouldn't come on. Cars are extremely orthogonal to the end user. If they weren't, we'd have many more car crashes.

Your code should also be orthogonal. Isn't it nice to know that to reconfigure many programs, usually you just need to change the config file and restart? I worked on one system that required -- simply to add an item to a pull-down menu -- that I rewrite some JavaScript, some Perl code, some SQL, and tweak the database. The JavaScript, Perl, and SQL were each in a different file. This, clearly, was a non-orthogonal system. Maintenance was a nightmare and the code was (surprise!) buggy.

So, here I was thinking about orthogonality when Stamp_Guy (whose name is used with permission) asked me how to securely verify whether or not a user-supplied path is safe to write to from a CGI script. Hmmm... we have a problem here. If there is only one path, the user doesn't need to supply it. Clearly we can have multiple paths. Ordinarily, if I need to do something with a directory, I try to write my code so that it is flexible with its environment. Directory doesn't exist? Create it. Has my program been moved? Maybe I should make directories relative to the program's location. By thinking about these needs, I make my code more orthogonal. When (not 'if') its environment changes, it's not affected.

But what about security and multiple target directories that we need to validate? My thought is to reconsider orthogonality. Here is one case where I don't want my code figuring out what to do on the fly. Why? I don't want to take a chance that either I or another programmer is going to introduce a bug that compromises security. I would much rather have my program die or spit out copious error messages than find out that I've been rooted.

First, here's some code that I threw together for this problem:

    #!/usr/bin/perl -wT
    use strict;
    use CGI qw/:standard/;

    my %paths = (
        '/somepath/data'           => '2',
        '/somepath/data/bob'       => '1',
        '/somepath/data/alice'     => '1',
        '/somepath/data/tom'       => '1',
        '/somepath/config/foo'     => '2',
        '/somepath/config/bar'     => '2',
        '/somepath/config/bar/baz' => '1'
    );

    my $tainted_path = param( 'path' );

    # this line will take things like 'foo/bar' or '///foo///bar' and return '/foo/bar'
    # note that all paths are assumed to be absolute. Easier that way.
    $tainted_path = '/' . join '/', grep { $_ !~ /^\s*$/ } split '/', $tainted_path;

    my $clean_path = '';

    # Do not, under any circumstances, change this routine unless
    # you know exactly what you are doing and why. If you're not
    # sure why I said that, then you don't know what you're doing
    if ( exists $paths{ $tainted_path } ) {
        ( $clean_path ) = ( $tainted_path =~ /^(.*)$/ );
    }
    else {
        # whups! Can't find it. Here's where we do the error handling
    }

    # $clean_path is now untainted and safe to use

Note: the unused hash values are provided in case someone needs "access level" control over the directories.

What's the strength of the above code? It's not possible for the user to enter a path that you do not want him or her to have access to. What's the weakness (aside from the dot star untainting that just gives me the willies)? This code is no longer orthogonal. If we need to add a new directory somewhere, I need to update my code. I could write the directories to a config file and have this (and other programs) read from it. That will make the maintenance a bit easier, but then I'm relying on something outside of the program to ensure security: not good.

I really don't want my code to be a maintenance headache. However, we've heard many times that security and convenience are in inverse relationship to one another. Is this an example? Have I missed a better way of approaching this problem?

Cheers,
Ovid

Join the Perlmonks Setiathome Group or just click on the link and check out our stats.

Re (tilly) 1: Orthogonal Code and Security
by tilly (Archbishop) on May 02, 2001 at 23:29 UTC
    In most security models the statement "I am doing something allowable!" is orthogonal to the statement "I find myself able to do this!" While the two pieces of information look similar, they are not the same, and so you find yourself wondering why you are repeating things.

    Remember that orthogonality means two things. First, that you do not repeat yourself. Second, that you do not intertwine things that are only accidentally connected to each other.

    So your approach is, in my eyes, orthogonal. What I would add to make it more orthogonal is that the hash should not hold an access level, rather have the key be what appears in the URL and the value be the corresponding path in your system. That way you are no longer advertising information about your directory structure, and if you ever changed your directory structure you would be able to avoid breaking old URLs. (Or you could make it work from two different URLs.)
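The key-to-path idea above can be sketched roughly as follows. This is only an illustration, not code from the thread: the %path_for hash, its keys, and the resolve_path helper are all made-up names, and the directories simply reuse the ones from Ovid's example.

```perl
use strict;
use warnings;

# Hypothetical mapping: the key is what appears in the URL, the value
# is the real directory. Every name here is illustrative only.
my %path_for = (
    data  => '/somepath/data',
    bob   => '/somepath/data/bob',
    alice => '/somepath/data/alice',
    foo   => '/somepath/config/foo',
);

# Return the directory for a user-supplied key, or undef when the key
# is missing or unknown. The returned string comes from our own hash,
# not from the user's input, so nothing has to be untainted.
sub resolve_path {
    my ($key) = @_;
    return unless defined $key;
    return unless exists $path_for{$key};
    return $path_for{$key};
}

# Example use: anything that is not an exact key is rejected.
for my $key ( 'bob', '../../etc/passwd' ) {
    my $dir = resolve_path($key);
    print defined $dir ? "$key -> $dir\n" : "$key -> rejected\n";
}
```

Because the user only ever names a key, the directory layout can change without breaking old URLs, and two keys can point at the same directory if needed.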

    Incidentally, note how I said "in most security models". At least one model I respect (on theoretical grounds at least) takes the approach that the only way you can try to do things is simultaneously how you validate that you can do it. Conceptually this is like saying that you have no need to validate whether someone can open the file that they are requesting to open because the only way they can name that file is through the directory handle, and the directory handle only shows them files they are allowed to open.

    If that is confusing, read the introductory essays there on capabilities (which are NOT to be confused with POSIX "capabilities"). Then re-read it comparing in your mind exactly how an OO system protects its internal data structures...

Re: Orthogonal Code and Security
by Masem (Monsignor) on May 02, 2001 at 22:12 UTC
    Would not the .htaccess approach from Apache work in this specific situation? Namely, by default, assume that a directory to be written to is blocked, but by the introduction of a dot file that the code can detect, you can either simply say that existence is merely enough to ensure access, or set up some complicated language that can determine security levels on the fly? In this case, you are moving the security configuration out of the program and into the file space.

    But I'm sort of confused as to what you are considering orthogonal and secure. In your example, you have a list of dirs, and you say that this isn't orthogonal because when you move the dir tree or add a new dir, you also have to update the code, so that's two places to change. But on the other hand, you say that moving any security out from the code and elsewhere is insecure. I don't see how anything that operates at the file level cannot do the latter. You have to decide where you are going to pack the security features; if you do it in perl, you lose orthogonality, if you do it on the file system, you lose 'security' by your thoughts (please correct me if I'm wrong).

    IMO, the best way to solve the problem above is to use a hash to point the keys to the directories that may be written to; the keys are the only things sent via CGI, while the values (untainted) are the only things used to create or write files. Your security configuration is in perl, but as you claim, this is not necessarily orthogonal.


    Dr. Michael K. Neylon - mneylon-pm@masemware.com || "You've left the lens cap of your mind on again, Pinky" - The Brain

      Masem wrote:

      You have to decide where you are going to pack the security features; if you do it in perl, you lose orthogonality, if you do it on the file system, you lose 'security' by your thoughts (please correct me if I'm wrong).

      You understood me correctly. My thoughts on security are pretty simple: if they're out to get you, paranoia makes good sense.

      Who are 'they'? Anyone who's not Ovid. What does 'out to get you' mean? Doing anything which may affect my code and what it does, whether or not it's intentional. I try to be ultra-paranoid when I write a CGI script. As a result, I don't want to trust anything outside of my code. Of course, I can't stop another programmer from going in and changing my code, but I don't want my code to be dependent upon what others have done. I suppose it's just finding a balance between what's reasonable and what's so frickin' stupid (perhaps my code?) that I deserve to be shot for writing it. Realizing that operating systems often have a certain level of security built into them, perhaps I can rely on this. On the other hand, I have a significant weakness in that I don't understand much about the operating systems themselves, or how they are set up. I tend not to want to rely on something that I don't know well.

      Hmm... obviously I need to learn more about this area.

      Cheers,
      Ovid


        One of the things that I thought about this is that generally, security and orthogonality cannot coexist, or better stated: (level of security) x (level of orthogonality) = constant. A more secure system is going to have multiple checkpoints, which will defeat the purpose of orthogonality (though not the ability to reuse code). Orthogonal code will be less secure because of the lack of interaction between security mechanisms.

        Go back to the car example: most automatics will not let you start the engine if the car is not in park, which means any orthogonality between the ignition and the transmission is gone. But, because of this, there is an increase in 'security' in that you won't be stripping gears or wearing the engine because of improper usage of either system.

        So the problem of security vs orthogonality comes down to the programmer (and possibly job description) to decide where they want to be; in your (Ovid's) case, you want to be ultra-strict, which means multiple checks, which means that systems need to be less orthogonal to make everything work. Someone else might feel that because of multiple programmers that are working on a project, orthogonality is an absolute must, and security measures may fall due to this. There's no right or wrong answer to "how secure is secure" or "how orthogonal is orthogonal"; it's all going to depend where the emphasis is to be placed by the programmer.

        So when faced with problems like these, the best way to approach them is exactly how your test case did: you ask a number of people for opinions: since everyone has a different idea where security/orthogonality should be, you'll get a number of solutions that sit along the curve given above, and you can choose a solution that is near the point where you desire.


        Dr. Michael K. Neylon - mneylon-pm@masemware.com || "You've left the lens cap of your mind on again, Pinky" - The Brain
      While more flexible, the .htaccess approach is intrinsically riskier than Ovid's, because you must pass a directory or file derived from untainted user data to a system call (e.g. opendir or open). This means you have to carefully consider all classes of valid input versus invalid and potentially dangerous input. From a security perspective, it's preferable to have a simple mapping of user input to prearranged file locations.
         MeowChow                                   
                     s aamecha.s a..a\u$&owag.print
        Hmm, I saw that suggestion working in a different manner: searching directories (either on the fly or updating at intervals depending on number of directories and how rapidly the system needs to respond to changes), and building a paths data structure from that, which could then be used to verify the user-input as in Ovid's example.
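One way to picture that variant, as a rough sketch only: walk a directory tree, collect every directory that contains a marker file, and use the result the same way Ovid's %paths hash is used. The marker name '.writable' and the build_allowed_paths helper are assumptions made up for this illustration. The sketch also does not run under -T; File::Find needs its untaint options for that.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical marker name: a directory may be written to only if it
# contains a file with this name.
my $MARKER = '.writable';

# Walk the tree under $root and return a hash ref whose keys are the
# directories that contain the marker file.
sub build_allowed_paths {
    my ($root) = @_;
    my %allowed;
    find(
        {
            no_chdir => 1,    # stay put; $File::Find::name is the path to test
            wanted   => sub {
                my $path = $File::Find::name;
                $allowed{$path} = 1
                    if -d $path && -e "$path/$MARKER";
            },
        },
        $root,
    );
    return \%allowed;
}
```

The user-supplied path is then checked with exists against the returned hash, so user input still never reaches opendir or open directly. The scan can run at startup or at intervals, depending on how quickly changes need to show up.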
Re: Orthogonal Code and Security
by eejack (Hermit) on May 03, 2001 at 07:59 UTC
    I don't know about better, but perhaps different.

    First, you have to acknowledge security comes with a cost, whether it is less maintainable code, or speed, or some other poison of your choice.

    Oftentimes we write things to help us maintain systems, servers, whatever. But I have found myself recently writing scripts to maintain my scripts. If that sounds a bit weird, it may be. For example, a different approach to your above example would be to put your hash info into a database or flatfile - anything that can have differing permissions.

    Using mysql as an example, you can have one user (let's call this user script) with read-only permissions on your configuration database table. If someone can see your source, they can find a way to read from the database (same as seeing your hash). You sacrifice a bit of speed making that call to the database - perhaps a lot of speed in some cases.

    However, you can then in turn write another script, using a different user, let's call this user ovid, who has write permissions. This script can check to see if the directories exist, if not create them, set up proper permissions, and put the proper information into the database.

    So, if you need to add something to your *script*, you simply run the *ovid* script, which sets things up for you.
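A minimal sketch of this two-role split, using the flatfile option rather than mysql so that it runs without a database. The helper names register_directory and load_allowed, the file modes, and the one-path-per-line format are all assumptions for illustration. The separation relies on the list file being writable only by the admin user and readable by the user the CGI script runs as.

```perl
use strict;
use warnings;
use File::Path qw(make_path);

# Admin side (the "ovid" role): create the directory if it is missing,
# then record it in the allow-list file, one path per line.
sub register_directory {
    my ( $list_file, $dir ) = @_;
    make_path( $dir, { mode => 0750 } ) unless -d $dir;
    open my $out, '>>', $list_file
        or die "Cannot append to $list_file: $!";
    print {$out} "$dir\n";
    close $out or die "Cannot close $list_file: $!";
    chmod 0644, $list_file or die "Cannot chmod $list_file: $!";
    return;
}

# CGI side (the "script" role): read the list and build a lookup hash.
# This side never writes to the file.
sub load_allowed {
    my ($list_file) = @_;
    open my $in, '<', $list_file
        or die "Cannot read $list_file: $!";
    chomp( my @dirs = <$in> );
    close $in;
    my %allowed = map { $_ => 1 } grep { length $_ } @dirs;
    return \%allowed;
}
```

The CGI script would call load_allowed once and then check the user's path with exists, as in the original example; adding a directory means running the admin script, not editing the CGI code.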

    Costs: original setup *may* be longer, speed *may* be slower
    Gains: security is tight, script has a predictable and maintainable method.

    Not better, but perhaps different enough.

    EEjack