http://qs1969.pair.com?node_id=292660

kilinrax has asked for the wisdom of the Perl Monks concerning the following question:

I have a script which uses File::Temp and File::Copy to create a temporary copy of a file to read from. The file in question is periodically rewritten by a Java process, and it's imperative that I get a complete version.

My concern is the conflicting warnings regarding correct usage of the two modules.
From File::Copy:
Note that passing in files as handles instead of names may lead to loss of information on some operating systems; it is recommended that you use file names whenever possible.
And from File::Temp:
For maximum security, endeavour always to avoid ever looking at, touching, or even imputing the existence of the filename. You do not know that that filename is connected to the same file as the handle you have, and attempts to check this can only trigger more race conditions. It's far more secure to use the filehandle alone and dispense with the filename altogether.
My question is: which of these warnings should take precedence? In this particular case, should I be using filehandles or names? And more generally, how do you make judgement calls between conflicting usage warnings in situations like these?

Replies are listed 'Best First'.
Re: Conflicting usage recommendations: File::Temp and File::Copy
by Abigail-II (Bishop) on Sep 19, 2003 at 14:37 UTC
    It all depends on what you are doing, and what is happening with the file. The warning from File::Temp points to the fact that after you have opened a file, something else may have (re)moved the file, and created another file with the same name as the original one. I think the remark in File::Copy points out that it's possible a file handle might not be opened with options that are appropriate - in the next sentences, it stresses the importance of having opened the file in "binmode" (on certain platforms).
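
A minimal sketch of the binmode point, assuming the copy is done handle-to-handle: both handles are put into binary mode before being passed to File::Copy's copy(). The source file here is a hypothetical stand-in created only so the example runs on its own.

```perl
use strict;
use warnings;
use File::Copy qw(copy);
use File::Temp qw(tempfile);

# Hypothetical source file, created here only so the sketch is self-contained.
my ($src_fh, $src_name) = tempfile(UNLINK => 1);
binmode $src_fh;
print {$src_fh} "line one\r\nline two\n";
close $src_fh or die "Cannot close $src_name: $!";

# Open the source ourselves and set binary mode on both handles, so no
# newline translation can happen on platforms where that matters.
open my $in, '<', $src_name or die "Cannot open $src_name: $!";
binmode $in;

my ($out, $tmp_name) = tempfile(UNLINK => 1);
binmode $out;

# copy() accepts handles as well as names; it does not close handles it was given.
copy($in, $out) or die "Copy failed: $!";
close $in;
close $out or die "Cannot close $tmp_name: $!";

my $source_size = (stat $src_name)[7];
my $copied_size = (stat $tmp_name)[7];
print "source: $source_size bytes, copy: $copied_size bytes\n";
```

On Unix-like systems binmode makes no difference to the bytes copied; the calls matter on platforms that distinguish text and binary files.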

    Abigail

Re: Conflicting usage recommendations: File::Temp and File::Copy
by Steve_p (Priest) on Sep 19, 2003 at 14:54 UTC

    I think you should take a closer look at how these modules work. I haven't worked with File::Temp in a while, but, if I remember correctly, its purpose was to create a file with a random or semi-random name that would go away the second the program is completed. The module considers the filename to be mostly irrelevant because it doesn't know the filename when the program starts. File::Copy, however, works mainly under the assumption that you know what the filename is.

    The warning in File::Temp is mainly for security purposes, so that a hacker can't just watch for the existence of a particular file and attack it. For example, if you had a Perl program to do a mass insert into the passwd file on UNIX, it wouldn't be too good to have a fixed file name that a hacker could intercept and add to during an update. It's also there to prevent race conditions where multiple processes could create similar temp files at the same time. I'm not a security expert, but if you have one single-threaded process copying a single file to a temporary location, I cannot see where using the filename from File::Temp would be a problem.
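
The name-based approach described above might look roughly like this. It is only a sketch: the source file is a hypothetical stand-in for the one the Java process rewrites, and the temporary handle is closed straight away so that everything afterwards goes through the name.

```perl
use strict;
use warnings;
use File::Copy qw(copy);
use File::Temp qw(tempfile);

# Hypothetical stand-in for the file that the other process rewrites.
my ($src_fh, $source) = tempfile(UNLINK => 1);
print {$src_fh} "payload\n";
close $src_fh or die "Cannot close $source: $!";

# In list context tempfile() returns an open handle and the generated name.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
close $tmp_fh or die "Cannot close $tmp_name: $!";

# Name-to-name copy, which is what File::Copy recommends.
copy($source, $tmp_name) or die "Copy failed: $!";

# Read the temporary copy back by name.
open my $in, '<', $tmp_name or die "Cannot open $tmp_name: $!";
my $content = do { local $/; <$in> };
close $in;
print $content;
```

This trusts that nothing else replaces the temporary file between tempfile() and copy(), which is the exposure the File::Temp warning is about.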

Re: Conflicting usage recommendations: File::Temp and File::Copy
by simonm (Vicar) on Sep 19, 2003 at 14:50 UTC
    From the File::Copy documentation, it looks like the "some operating systems" they're talking about include choices like Windows, OS/2, and VMS.

    If you're using a Unix-equivalent, I would guess that this does not apply to you, and that the warning from File::Temp should supersede it. (But that's just a guess.)

Re: Conflicting usage recommendations: File::Temp and File::Copy
by herveus (Prior) on Sep 19, 2003 at 14:53 UTC
    Howdy!

    File::Temp is about the secure creation of temp files. It gives you back a file handle to the opened file and the file name. If you depend solely on the file name, you could get burned; the file handle points to the actual file File::Temp created.
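
To illustrate how the name and the handle can come apart, here is a small sketch that assumes a Unix-like filesystem (where an open file can be unlinked). The "replacement" step stands in for whatever another process might do behind the name.

```perl
use strict;
use warnings;
use IO::Handle;
use File::Temp qw(tempfile);

my ($fh, $name) = tempfile(UNLINK => 1);
print {$fh} "original\n";
$fh->flush;

# Simulate another process removing the file and creating a new one
# under the same name (assumes Unix-like unlink semantics).
unlink $name or die "Cannot unlink $name: $!";
open my $other, '>', $name or die "Cannot recreate $name: $!";
print {$other} "replacement\n";
close $other or die "Cannot close $name: $!";

# The handle still refers to the file File::Temp created ...
seek($fh, 0, 0) or die "Cannot seek: $!";
my $via_handle = <$fh>;

# ... while the name now refers to a different file.
open my $by_name, '<', $name or die "Cannot open $name: $!";
my $via_name = <$by_name>;
close $by_name;

my $handle_inode = (stat $fh)[1];
my $name_inode   = (stat $name)[1];

print "via handle: $via_handle";
print "via name:   $via_name";
print "same file?  ", ($handle_inode == $name_inode ? "yes" : "no"), "\n";
```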

    With File::Copy, you are normally going to be referencing files by name, so converting the name to a file handle carries different hazards.

    Consider the two warnings in the context of their respective modules. They are not contradictory unless you remove them from their contexts.

    yours,
    Michael

      I fear you misinterpret my use of the word 'contradictory'. The warnings are contradictory merely in the context of attempting to use the modules together in the way I describe; under such circumstances you cannot heed both.
        Howdy!

        True...given your particular usage, you can't slavishly heed both, but you can make a mature decision which warning is superfluous in your specific situation. (did I use enough big words? :))

        yours,
        Michael

Re: Conflicting usage recommendations: File::Temp and File::Copy
by John M. Dlugosz (Monsignor) on Sep 19, 2003 at 16:32 UTC
    If you consistently don't use the name at all, then nothing will be done to the file that can't be done using the handle alone (e.g. open a secondary data stream). In that case, copying the file via the handle should work properly, too.

    However, File::Copy might not copy security attributes or other things if given handle only; that might specifically be asking to copy only the file content. Conceptually, anything done to the original by knowing its handle only can be done to the copy as well, supposing that there are "get" calls to match all "set" calls on handles.

Re: Conflicting usage recommendations: File::Temp and File::Copy
by barbie (Deacon) on Sep 19, 2003 at 17:20 UTC
    I'm assuming that the Java process is writing to the file you want to copy from. In this instance, how you access File::Temp is largely down to your preference, as only your process is using it and the file should have been created with a unique inode.

    However, the File::Copy process is likely to get clobbered at some point, unless the Java process does an atomic update. A Perl example of this is IO::AtomicFile. If an atomic update is used, then using filehandles will ensure you have a complete file. If the Java process doesn't use an atomic update method, then it's possible you will copy an incomplete or even corrupt file.
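
For readers unfamiliar with the idea, an atomic update on the writer's side can be sketched with core modules as "write to a temporary file in the same directory, then rename over the target". This is not IO::AtomicFile's actual implementation, and atomic_write is a hypothetical helper name; it assumes a filesystem where rename within one directory replaces the target in a single step.

```perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Temp qw(tempfile tempdir);

# Hypothetical helper: replace $target with $data so that readers see
# either the old version or the new one, never a partial file.
sub atomic_write {
    my ($target, $data) = @_;

    # The temporary file lives in the target's directory so that
    # rename() stays on one filesystem.
    my ($fh, $tmp) = tempfile(DIR => dirname($target));
    binmode $fh;
    print {$fh} $data or die "Cannot write $tmp: $!";
    close $fh or die "Cannot close $tmp: $!";

    rename($tmp, $target) or die "Cannot rename $tmp to $target: $!";
    return;
}

my $dir    = tempdir(CLEANUP => 1);
my $target = "$dir/data.txt";

atomic_write($target, "version 1\n");
atomic_write($target, "version 2\n");

open my $in, '<', $target or die "Cannot open $target: $!";
my $content = do { local $/; <$in> };
close $in;
print $content;
```

A reader that already holds an open handle on the old version keeps reading that complete old version after the rename, which is why filehandles are safe in that case.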

    As others have mentioned, the context of how you use these modules dictates whether the lines you quoted conflict. In this instance it may be safer to read from a filename and write to a filehandle. YMMV.
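
Reading from a filename and writing to a filehandle could look something like the following sketch. The source file is a hypothetical stand-in; the temporary file is requested in scalar context so that only a handle is returned and no temporary name is ever used.

```perl
use strict;
use warnings;
use File::Copy qw(copy);
use File::Temp qw(tempfile);

# Hypothetical stand-in for the file being copied.
my ($src_fh, $source) = tempfile(UNLINK => 1);
print {$src_fh} "complete file\n";
close $src_fh or die "Cannot close $source: $!";

# In scalar context tempfile() returns only the handle.
my $tmp_fh = tempfile();
binmode $tmp_fh;

# Source by name (as File::Copy recommends), destination by handle
# (as File::Temp recommends).
copy($source, $tmp_fh) or die "Copy failed: $!";

# copy() leaves the handle open; rewind it and read the copy back.
seek($tmp_fh, 0, 0) or die "Cannot seek: $!";
my $content = do { local $/; <$tmp_fh> };
print $content;
```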

    --
    Barbie | Birmingham Perl Mongers | http://birmingham.pm.org/

      The code in question does check that the file isn't being accessed when it does the copy, and checks that the copy gets the whole file afterwards - so while I'd like to minimise the risk of such occurrences, they're not disastrous unless they start occurring so often the script hits its retry limit and bails out.
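
The original script is not shown, so the following is only a guess at the general shape of such a check-and-retry loop. The helper name copy_when_stable, the retry limit, and the use of size and mtime as the "did it change?" test are all assumptions.

```perl
use strict;
use warnings;
use File::Copy qw(copy);
use File::Temp qw(tempfile);

# Hypothetical helper: copy $source to $dest, retrying until the source's
# size and mtime are the same before and after the copy and the copy has
# the same size. Returns the attempt number that succeeded, or 0.
sub copy_when_stable {
    my ($source, $dest, $max_tries) = @_;
    for my $try (1 .. $max_tries) {
        my @before = stat $source;
        if (@before && copy($source, $dest)) {
            my @after  = stat $source;
            my @copied = stat $dest;
            # Index 7 is the size in bytes, index 9 the modification time.
            return $try
                if @after && @copied
                && $before[7] == $after[7]
                && $before[9] == $after[9]
                && $copied[7] == $after[7];
        }
        sleep 1 if $try < $max_tries;    # give the writer time to finish
    }
    return 0;
}

# Hypothetical source and destination, created so the sketch runs on its own.
my ($src_fh, $source) = tempfile(UNLINK => 1);
print {$src_fh} "some data\n";
close $src_fh or die "Cannot close $source: $!";

my ($dst_fh, $dest) = tempfile(UNLINK => 1);
close $dst_fh or die "Cannot close $dest: $!";

my $attempt = copy_when_stable($source, $dest, 5);
die "Gave up after 5 attempts\n" unless $attempt;
print "copied on attempt $attempt\n";
```

A size-and-mtime comparison cannot catch a rewrite that finishes within the timestamp's resolution and leaves the size unchanged, so it reduces the risk rather than removing it.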