OfficeLinebacker has asked for the wisdom of the Perl Monks concerning the following question:

Greetings, esteemed monks!

I need to make sure that two directory trees are identical in structure, but in name only (i.e., the list of filenames is identical, but the files in the "mirrored" dir will be variations on the theme of the files in the "master" directory).

Is this just a simple File::Find thing with two calls to find(), where I would push each $File::Find::name value onto the corresponding array, and then a simple equality test at the end?

At first all I care about is just getting notified if the mirror is off. Then I'd like to build up to outputting a list of the differences, then to actually fixing the differences (rsync-style).

There's a twist though; there is one subdirectory in the "master" tree that we will likely NOT want mirrored over. I imagine I could take care of that with a conditional in the &wanted sub, but what do you think? Maybe add it as an argument to the program in case the list of non-mirrored dirs grows? I don't know how likely that is.
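
Here's roughly what I have in mind — a sketch only; 'master', 'mirror', and the 'no_mirror' subdirectory below are placeholder names:

```perl
#!/usr/bin/perl
# Two File::Find passes: gather names (relative to each tree's root)
# from both trees, then compare the sorted lists.  All paths here are
# placeholders.
use strict;
use warnings;
use File::Find;

my %skip = ( no_mirror => 1 );    # subdirs we do NOT want mirrored

# Walk $root and push every name (relative to $root) onto @$list,
# pruning any directory listed in %skip.
sub collect {
    my ($root, $list) = @_;
    find(sub {
        if (-d $_ && $skip{$_}) {
            $File::Find::prune = 1;    # don't descend into skipped dirs
            return;
        }
        (my $rel = $File::Find::name) =~ s/^\Q$root\E\/?//;
        push @$list, $rel if length $rel;
    }, $root);
}

my ($master, $mirror) = ('master', 'mirror');    # placeholder paths
if (-d $master && -d $mirror) {
    my (@m, @r);
    collect($master, \@m);
    collect($mirror, \@r);
    print join("\0", sort @m) eq join("\0", sort @r)
        ? "trees match\n"
        : "trees differ\n";
}
```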

Let me know if code/pseudocode would help, and if the general premise isn't as clear as I intended.

Terrence

_________________________________________________________________________________

I like computer programming because it's like Legos for the mind.


Replies are listed 'Best First'.
Re: How to ensure duplicate directory and file tree, but not file contents themselves?
by wojtyk (Friar) on Aug 24, 2006 at 15:57 UTC
    Assuming you're on a UNIX box, you don't need Perl at all.

    The UNIX diff utility already comes close. When given two directories as arguments, it prints an "Only in <dir>: <name>" line for every name present in just one tree. (It also diffs the contents of files common to both trees, so if you only care about names, filter its output down to the "Only in" lines.)

    If you really wanted to do it in Perl, however...I would do what you suggested, but with one small modification. Instead of building two arrays and iterating them both at the end for equality, just create a hash of all the filenames on the first run of find...then check "on-the-fly" during the second run. It's more efficient.
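
    A sketch of that hash-based version (the directory names are placeholders; the helper returns what's missing from the mirror and what's extra in it):

```perl
#!/usr/bin/perl
# Record every relative name from the master tree in a hash, then tick
# names off on the fly while walking the mirror.  Leftover hash keys
# were never mirrored; names not in the hash are extra in the mirror.
# 'master' and 'mirror' below are placeholder directory names.
use strict;
use warnings;
use File::Find;

sub tree_diff {
    my ($master, $mirror) = @_;
    my %seen;

    # Pass 1: remember every name in the master tree, relative to its root
    find(sub {
        (my $rel = $File::Find::name) =~ s/^\Q$master\E\/?//;
        $seen{$rel} = 1 if length $rel;
    }, $master);

    # Pass 2: delete matches as we go; anything not in %seen is extra
    my @extra;
    find(sub {
        (my $rel = $File::Find::name) =~ s/^\Q$mirror\E\/?//;
        return unless length $rel;
        delete $seen{$rel} or push @extra, $rel;
    }, $mirror);

    return ([sort keys %seen], [sort @extra]);
}

if (-d 'master' && -d 'mirror') {
    my ($missing, $extra) = tree_diff('master', 'mirror');
    print "missing from mirror: @$missing\n" if @$missing;
    print "extra in mirror: @$extra\n"       if @$extra;
}
```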

Re: How to ensure duplicate directory and file tree, but not file contents themselves?
by andyford (Curate) on Aug 24, 2006 at 16:16 UTC
    If you can use rsync itself, you could try the "--dry-run" option to get just the list of differences. In fact, rsync may well be able to do everything you need via its exclude and include options.
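
    Something along these lines could drive rsync from Perl — the flags shown (--dry-run, --archive, --itemize-changes, --exclude) are standard rsync options, but the paths and the excluded subdirectory are placeholders, and the command is only assembled and printed here, not run:

```perl
#!/usr/bin/perl
# Assemble an rsync command for a dry run.  Drop --dry-run (and switch
# the print to a system call) once the reported changes look right.
use strict;
use warnings;

my @cmd = (
    'rsync',
    '--dry-run',               # report what would change, touch nothing
    '--archive',               # recurse and preserve the tree structure
    '--itemize-changes',       # one line per difference
    '--exclude', 'no_mirror/', # the subdirectory we don't want mirrored
    'master/',                 # trailing slash: contents of master...
    'mirror/',                 # ...into mirror
);

print join(' ', @cmd), "\n";
# To actually run it: system(@cmd) == 0 or die "rsync failed: $?";
```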

    andyford
    or non-Perl: Andy Ford

Re: How to ensure duplicate directory and file tree, but not file contents themselves?
by planetscape (Chancellor) on Aug 25, 2006 at 01:10 UTC

    While this recent thread talks about FTP, I believe there are still ideas within that may be useful to you.

    HTH,

    planetscape
Re: How to ensure duplicate directory and file tree, but not file contents themselves?
by zentara (Cardinal) on Aug 25, 2006 at 13:33 UTC
    Another idea would be File::Slurp::Tree, although if one of the trees lives on a remote machine, getting its hash over would require some serialization. It puts the trees into hash structures, which would allow you to use all of the hash-comparison techniques available.
    #!/usr/bin/perl
    use warnings;
    use strict;
    use File::Slurp::Tree;

    # The tree datastructure is a hash of hashes.  The keys of
    # each hash are names of directories or files.  Directories
    # have hash references as their value, files have a scalar
    # which holds the contents of the file.
    my $dir  = shift || '.';
    my $tree = slurp_tree($dir);

    my $depth = 0;
    print "$dir\n";
    print_keys($tree);

    sub print_keys {
        my $href = shift;
        $depth++;
        foreach ( keys %$href ) {
            print ' ' x $depth, "--$_\n";
            print_keys( $href->{$_} ) if ref $href->{$_} eq 'HASH';
        }
        $depth--;
    }
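
    Once both trees are in that hash-of-hashes form, the name-only comparison is a straightforward recursive walk. A sketch using plain made-up nested hashes, so it runs without the module:

```perl
#!/usr/bin/perl
# Walk two tree structures (hash-of-hashes, File::Slurp::Tree style)
# in parallel and report names present on only one side.  File
# contents are deliberately ignored: names only.
use strict;
use warnings;

sub tree_name_diff {
    my ($a, $b, $path, $only_a, $only_b) = @_;
    $path   ||= '';
    $only_a ||= [];
    $only_b ||= [];
    for my $k (sort keys %$a) {
        if (!exists $b->{$k}) {
            push @$only_a, "$path$k";
        } elsif (ref $a->{$k} eq 'HASH' && ref $b->{$k} eq 'HASH') {
            tree_name_diff($a->{$k}, $b->{$k}, "$path$k/", $only_a, $only_b);
        }
    }
    for my $k (sort keys %$b) {
        push @$only_b, "$path$k" unless exists $a->{$k};
    }
    return ($only_a, $only_b);
}

# Made-up sample data in the same shape slurp_tree would produce
my %master = ( etc => { motd => "hi\n" }, README => "x\n" );
my %mirror = ( etc => {},                 README => "y\n" );
my ($only_m, $only_r) = tree_name_diff(\%master, \%mirror);
print "only in master: @$only_m\n";   # prints "only in master: etc/motd"
print "only in mirror: @$only_r\n";
```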

    I'm not really a human, but I play one on earth. Cogito ergo sum a bum
Re: How to ensure duplicate directory and file tree, but not file contents themselves?
by OfficeLinebacker (Chaplain) on Aug 24, 2006 at 22:06 UTC
    ++you guys! Thanks, you've given me some good avenues to pursue!

    T.
