in reply to Re: Untainting 'bad' filenames
in thread Untainting 'bad' filenames

The program knows the filename, but I can't anticipate all permutations in advance, so I can't write a tight regex for it. And a filename that comes from a readdir must be (I'm pretty sure) untainted before I can move it (assuming I have taint checking turned on).
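
For reference, the usual way to untaint a value is to capture it with a regex, since captured substrings are considered clean. The sketch below is only an illustration: the helper name untaint_filename and the character whitelist are assumptions, not something from my actual script.

```perl
use strict;
use warnings;

# Hypothetical helper: return an untainted copy of $name, or nothing
# if it contains anything outside a conservative whitelist.
sub untaint_filename {
    my ($name) = @_;
    return unless defined $name;

    # Assumed whitelist: letters, digits, underscore, dot, hyphen,
    # and the name must not start with a dot or a hyphen.
    if ($name =~ /\A([A-Za-z0-9_][A-Za-z0-9_.-]*)\z/) {
        return $1;    # a captured substring is untainted
    }
    return;
}

# Sample names standing in for what readdir might hand back.
for my $entry ('data_2000.txt', '../etc/passwd', '-rf') {
    my $clean = untaint_filename($entry);
    print defined $clean ? "ok:  $clean\n" : "bad: $entry\n";
}
```

The catch is exactly the one I described: a name that fails the whitelist comes back undefined, so it is still tainted and still can't be handed to rename().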

But like I mentioned above, since the directory I'm reading from is supposed to be reasonably secure, if I start encountering really weird filenames, I have bigger problems than just untainting.

thanks

Re: Re: Re: Untainting 'bad' filenames
by PsychoSpunk (Hermit) on Dec 08, 2000 at 22:27 UTC
    The program knows the filename, but I can't anticipate all permutations in advance

    I'm not sure about the taint feature in this respect, so I'll refrain from commenting further on that issue. But with the filename, when I say you do know the filename, it is in relation to the script. IOW, your spec seems to state that you do know the format of the filename, but not the filename.

    But in order to compare the filename to a regex, you (the script is simply an extension of you; be the script :) have to know the filename. The regex shouldn't check all permutations of the name. It should check valid permutations.

    In which case, you can write a very tight regex, since it is based on your valid filename. I think you're taking too many variables into account in solving your problem. I see a single variable, the filename, and a single control, the format the filename should match. That makes it a binary operation: it matches or it doesn't. What is it that I'm missing in this discussion? (This is purely discussion, since it seems as though someone may have provided a solution that you will use.) I'm interested in case I ever see this problem myself.
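
    To make the "matches or it doesn't" idea concrete, here is a minimal sketch. I don't know your real format, so the pattern (report_YYYYMMDD.txt) and the name classify_filename are made up purely for illustration.

    ```perl
    use strict;
    use warnings;

    # Assumed format, for illustration only: report_YYYYMMDD.txt
    my $VALID_NAME = qr/\A(report_[0-9]{8}\.txt)\z/;

    # Hypothetical helper: returns ('good', untainted name) on a match,
    # and ('bad', undef) for anything else.
    sub classify_filename {
        my ($name) = @_;
        return $name =~ $VALID_NAME ? ('good', $1) : ('bad', undef);
    }

    for my $name ('report_20001208.txt', 'report_2000.txt', 'report_20001208.txt; rm -rf /') {
        my ($status) = classify_filename($name);
        print "$status: $name\n";
    }
    ```

    The regex only describes the valid names. Everything else falls into the 'bad' bucket without the pattern having to know what it looks like.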

    ALL HAIL BRAK!!!

      PsychoSpunk, what you're missing is that doran is asking how to deal with bad filenames, that is, filenames that don't fit the specified format. How should he untaint those filenames, which are in an unknown format, so that he can safely pass them to the rename() function?
        Oh, but I was under the impression that he was simply moving the file if it didn't match and technically not renaming it. But, I do see how inspecting the file would be made more difficult if you don't know the format.

        Wouldn't the rename() be easy to accomplish if he stripped any characters from the filename that would cause issues? I guess what I'm failing to see is what happens to bad filenames after they're moved. I'm looking at the problem as if we have good filenames and bad filenames. If it's not good, I need to move the file elsewhere. Thus, I need to know if the filename has any characters that would cause the rename function to explode. After that, I would simply use another script to inspect the internals of the files considered bad.
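
        Roughly what I have in mind, as a sketch only. The helper name quarantine_names and the set of "safe" characters are my own assumptions. Note that the source argument to rename() still has to be the original name, so the sketch untaints it with a deliberately loose pattern that only refuses names that could point outside the directory, and uses the stripped version as the destination.

        ```perl
        use strict;
        use warnings;

        # Hypothetical helper: given a raw name from readdir, return
        # (untainted original name, sanitized destination name), or an
        # empty list if the name could refer outside the directory.
        sub quarantine_names {
            my ($raw) = @_;
            return if $raw eq '.' || $raw eq '..';

            # Loose untaint: accept anything without "/" or NUL.
            my ($source) = $raw =~ m{\A([^/\0]+)\z} or return;

            # Destination: replace every character outside the assumed
            # safe set, and avoid a leading dot or hyphen.
            (my $dest = $source) =~ s/[^A-Za-z0-9_.-]/_/g;
            $dest =~ s/\A[.-]/_/;

            return ($source, $dest);
        }

        my ($source, $dest) = quarantine_names('bad name;rm -rf.txt');
        print "would move '$source' to '$dest'\n";
        ```

        The actual move would then be something like rename("$in_dir/$source", "$bad_dir/$dest"), where $in_dir and $bad_dir are placeholder variables for the two directories. Two different bad names can collapse to the same sanitized name, so a real script would have to check for collisions before renaming.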

        The environment seems to be controlled, in the sense that both directories are only accessible to "trusted users". I may be wrong about that. But if that's the case, then what is the difference between inspection before moving and inspection afterwards? I'm battling this out because I want to know why the previously suggested way of checking the file is better than this idea of having a second script check the bad files.
