in reply to Finding out whether two directories are the same

This is basically the same philosophical problem as determining whether two URLs point to "the same page". You can't really tell.

You can check the device and inode fields of the results of stat for both directories, but that will only work for file systems that follow the idea the inventors originally had. This likely won't detect file systems that are mounted at two separate places in the same directory tree, and it won't work where the inode field is always empty or zero.
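
As a rough sketch of the stat check (the helper name same_dir_by_inode is made up for illustration, and the "undef means can't tell" convention is my own assumption):

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if both paths report the same
# (device, inode) pair, 0 if they differ, and undef if the comparison
# is not meaningful (stat failed, or the inode field is zero).
sub same_dir_by_inode {
    my ($x, $y) = @_;
    my ($dev_x, $ino_x) = stat($x) or return;
    my ($dev_y, $ino_y) = stat($y) or return;
    # Some file systems always report inode 0; treat that as "unknown".
    return if !$ino_x || !$ino_y;
    return ($dev_x == $dev_y && $ino_x == $ino_y) ? 1 : 0;
}
```

A result of 1 is only as trustworthy as the file system's inode numbers, and a result of 0 does not rule out the same directory being reachable through two different mounts.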

You can readdir the directories and consider them to be the same if they contain the same files. Possibly you can also check and compare the sizes and timestamps of all directory entries.
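
A minimal sketch of the readdir comparison, assuming that names, sizes and modification times are enough for your purposes (dir_signature and same_listing are hypothetical names):

```perl
use strict;
use warnings;

# Hypothetical helper: maps each entry name in $dir to "size:mtime".
sub dir_signature {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    my %sig;
    for my $name (grep { $_ ne '.' && $_ ne '..' } readdir $dh) {
        my @st = lstat("$dir/$name");
        # Index 7 is the size, index 9 the modification time.
        $sig{$name} = @st ? "$st[7]:$st[9]" : 'unreadable';
    }
    closedir $dh;
    return \%sig;
}

# True if both directories list the same names with the same
# sizes and modification times.
sub same_listing {
    my ($x, $y) = @_;
    my ($sx, $sy) = (dir_signature($x), dir_signature($y));
    return 0 if keys %$sx != keys %$sy;
    for my $name (keys %$sx) {
        return 0 unless exists $sy->{$name} && $sy->{$name} eq $sx->{$name};
    }
    return 1;
}
```

Note that this only says the listings look alike. Two distinct directories with identical contents will also compare as equal.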

But in the end, you can never be exactly sure.

Re^2: Finding out whether two directories are the same (insert)
by tye (Sage) on Aug 28, 2008 at 09:08 UTC

    Thanks for filling in those details, Corion. Now I can just skip to the crazy ideas:

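    use Digest::MD5 qw(md5_hex);

    # Create a randomly named subdirectory in $x and see whether it appears in $y.
    my $rand = md5_hex( rand() . $$ . $x . "super secret" );
    mkdir( "$x/$rand" ) or die "Cannot mkdir $x/$rand: $!";
    if( -d "$y/$rand" ) { warn "$x == $y\n"; }
    rmdir( "$x/$rand" );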

    I only created a subdirectory instead of a file because it makes for more concise Perl code. Or you could lock a file you find in $x and see if you hold the lock on that file in $y? (But that assumes that all of your file system exporting technologies convey lock information, which is not always the case, of course.) You could create a pipe in one... etc.

    - tye        

      Or you could lock a file you find in $x and see if you hold the lock on that file in $y

      Or, then, I could create a tempfile (File::Temp) in one, and see whether the tempfile exists in the other!!!

      -- 
      Ronald Fischer <ynnor@mm.st>

        rovf++ - this may well be the most portable idea of all. There is a very small chance of failure in weird setups like AFS, in which synchronization between the exported and the local dir may happen days apart, but that's a real stretch.

        
        -- 
        Human history becomes more and more a race between education and catastrophe. -- HG Wells
        

        Don't forget to check whether the file creation was successful. If you don't have permission to create the file and you don't check for that, your existence check might lead to the wrong conclusion!
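
The tempfile idea from this sub-thread, including the creation check, might be sketched like this. The helper name same_dir_by_tempfile and the 'samedir' template prefix are made up for illustration; it relies on File::Temp->new croaking when the file cannot be created:

```perl
use strict;
use warnings;
use File::Temp ();
use File::Basename qw(basename);

# Hypothetical helper: creates a uniquely named temp file in $x and
# checks whether a file of that name shows up in $y. File::Temp->new
# croaks if the file cannot be created, so a permission problem is
# never mistaken for "different directories".
sub same_dir_by_tempfile {
    my ($x, $y) = @_;
    my $tmp  = File::Temp->new(DIR => $x, TEMPLATE => 'samedirXXXXXXXX');
    my $name = basename($tmp->filename);
    my $found = -e "$y/$name";
    # The temp file is removed when $tmp goes out of scope.
    return $found ? 1 : 0;
}
```

On setups with delayed synchronization, as mentioned above for AFS, a 0 from this check is not conclusive.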

Re^2: Finding out whether two directories are the same
by rovf (Priest) on Aug 28, 2008 at 09:41 UTC
    You can check the device and inode field of the results of stat for both directories, but that will only work for file systems that follow the idea the inventors originally had.

    I guess this means "the usual file systems on Unix and Linux". Do you happen to know to what extent this is fulfilled for networked file systems on Windows?

    You can readdir the directories and consider them to be the same if they contain the same files.
    In my case, this would mean I would have to actually compare the content of the files, because even the sizes could be the same :-(
    Thanks a lot for the clarification.

    Update: I just made a test on Windows, and the inode number indeed comes up as zero.
    -- 
    Ronald Fischer <ynnor@mm.st>

      You might take a gander at how rsync does it for inspiration. Or maybe use rsync -n and see if it thinks there's a difference.

      The cake is a lie.
      The cake is a lie.
      The cake is a lie.

        Testing for identical content (whether using rsync or something else) will not give you a 100% sure answer. Even if the content is the same, the directories do not have to be. For instance:
        $ cp -a dir1 dir2
        will create a second directory with identical content, but dir1 and dir2 are different. You'd have to look at the inode change time to see a difference, but even then, it's not a guarantee.

        OTOH, even if you think the content is different, the directories may be the same. If the directories are the same, you'll be looking at the same directory twice, but since you cannot guarantee that both inspections happen at the same instant, the directory, or any of the files inside it, may have changed between your two inspections.

        Having said that, I don't think there's a solution that will always work everywhere. But usually you will know a bit about the environment. If you know your filesystems are local, and no device is mounted twice, checking inode and device number will do. If you know the directories to be tested aren't being modified while you make your check, checking for content may work.

        As far as I can see, rsync does not run on native Windows (Cygwin is not an option here) :-(

        -- 
        Ronald Fischer <ynnor@mm.st>