Re: Generic compare script.
by graff (Chancellor) on Oct 12, 2005 at 05:39 UTC
If you are seeking to determine whether files on host.A are identical to files on host.B, create a single process that will run separately on each host, use File::Find to locate all data files, use Digest::MD5 to create an MD5 signature (checksum) for each file, and output a list of file names and checksums.
Then write a separate script (much simpler) to compare the lists of file names and checksums from the two hosts, to report (1) files on A not on B, (2) files on B not on A, and (3) files with same name on each host but different content.
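The per-host half of that plan might be sketched like this (the sub name, sorting, and output handling are my own choices, not from the original post):

```perl
use strict;
use warnings;
use File::Find;
use Digest::MD5;

# Walk a directory tree and return sorted "checksum  filename" lines,
# one per data file.
sub checksum_tree {
    my ($base) = @_;
    my @lines;
    find(sub {
        return unless -f $_;
        open my $fh, '<', $_ or do { warn "$File::Find::name: $!"; return };
        binmode $fh;
        push @lines,
            Digest::MD5->new->addfile($fh)->hexdigest . "  $File::Find::name";
    }, $base);
    return sort @lines;
}

# Run this same script on each host and save the output to a per-host list:
#   perl checksum_tree.pl /base/dir/path > host.A.checksum.list
if (@ARGV) {
    print "$_\n" for checksum_tree($ARGV[0]);
}
```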
Bear in mind that the only reason you would need to write a Perl script to do this is so you could do it easily on the Windows machines. The standard tools on any Unix box are already on hand to do it all with a couple of simple command lines:
# on host.A:
find /base/dir/path -type f -print0 | xargs -0 md5 > host.A.checksum.list
# and likewise on host.B, then put both list files in one place and try:
diff host.A.checksum.list host.B.checksum.list
Actually, these tools (as well as a couple different Bourne-like shells, bash and ksh) have been ported to Windows (look for Cygwin, AT&T Research Labs "UWIN", maybe others), so you could do shell commands like the ones above on all your machines.
In case diff makes the list comparison a bit too opaque for you, I wrote a handy list-compare utility that might help with the last step, and posted it here: cmpcol.
(update: It may be that I have misread your post. For the case of the same file name existing on two hosts, the plan I suggested will only report whether they are identical or not. If you actually want to describe the nature of the differences, this plan will at least tell you which files need to be inspected in closer detail, which will save you a lot of time and trouble. Ideally, you would have just a few pairs of files that need to be fetched into a common location, and you can use "diff" on them, or whatever.)
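The list-comparison step could look something like this (the list-file names and the "checksum, two spaces, path" line format are assumptions matching the find/md5 output above):

```perl
use strict;
use warnings;

# Read a checksum list ("md5hex  path" per line) into a path => checksum hash.
sub read_list {
    my ($file) = @_;
    my %sum;
    open my $fh, '<', $file or die "$file: $!";
    while (<$fh>) {
        chomp;
        my ($md5, $name) = split /\s+/, $_, 2;
        $sum{$name} = $md5;
    }
    return \%sum;
}

# Report (1) files only on A, (2) files only on B, and (3) same-name files
# whose checksums differ.
sub compare_lists {
    my ($list_a, $list_b) = @_;
    my @report;
    for my $name (sort keys %$list_a) {
        if (!exists $list_b->{$name}) {
            push @report, "only on A: $name";
        }
        elsif ($list_a->{$name} ne $list_b->{$name}) {
            push @report, "differs: $name";
        }
    }
    for my $name (sort keys %$list_b) {
        push @report, "only on B: $name" unless exists $list_a->{$name};
    }
    return @report;
}

if (@ARGV == 2) {
    print "$_\n" for compare_lists(read_list($ARGV[0]), read_list($ARGV[1]));
}
```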
my $find = "find /path -type f -print0 | xargs -0 md5";
for my $host ( @hostlist ) {
    open my $out, '>', "$host.md5list" or die "$host.md5list: $!";
    print $out `ssh $host '$find'`;
    close $out;
}
# compare lists here, if you like, or use a separate script/tool to do that
That assumes that you have the appropriate authentication keys for using ssh without a password to connect to each host. Other methods are possible for the connections, of course.
(updated the script to include "xargs -0", and to run the md5 part on the remote host, where it belongs -- note the single quotes around $find in the ssh command line.)
(another update: I should confess that I have no clue how you would actually execute a shell script on a remote Windows machine... good luck with that.)
Re: Generic compare script.
by pg (Canon) on Oct 12, 2005 at 04:30 UTC
You can try File::Compare.
But I myself also did something for the same purpose, and what I did was simply use the DOS fc command to compare files. I also used File::Glob to get the list of files in a particular directory, then looped through each file and executed fc through system(). The results can be captured through redirection:
perl -w blah.pl > result
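A minimal sketch of that approach (the directory names are hypothetical, and since fc is Windows-only, the file-pairing logic is separated out):

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob);
use File::Basename qw(basename);

# Pair up same-named files that exist in both directories.
sub paired_files {
    my ($dir_a, $dir_b) = @_;
    my @pairs;
    for my $file_a (bsd_glob("$dir_a/*")) {
        next unless -f $file_a;
        my $file_b = "$dir_b/" . basename($file_a);
        push @pairs, [ $file_a, $file_b ] if -f $file_b;
    }
    return @pairs;
}

# Hypothetical directories; fc sets a nonzero exit status when files differ,
# so $? tells us which pairs to report (Windows only).
my ($dir_a, $dir_b) = ('C:/prod', 'C:/dev');
if (-d $dir_a && -d $dir_b) {
    for my $pair (paired_files($dir_a, $dir_b)) {
        my $out = `fc "$pair->[0]" "$pair->[1]"`;
        print "$pair->[0] vs $pair->[1]:\n$out\n" if $? != 0;
    }
}
```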
I have used File::Compare in the past, but it doesn't exactly do the job here. I have files on remote machines... I don't want to have to FTP them both to the local machine.
I was thinking of telnetting into the remote machines and running the following:
perl -MFile::Find -e "find(sub{print qq($File::Find::name\n);},q($path))"
and capturing the output to get my list of files.
Then using telnet again to 'cat' the files, capturing the output from both machines, and comparing that... which I suppose is just like using FTP in the end.
There must be a better way!!!!!?
Re: Generic compare script.
by GrandFather (Saint) on Oct 12, 2005 at 04:24 UTC
Where do you need help with this? Have you looked at Text::Diff, for example? Do you intend traditional diff-style text output, or is the real effort going into some sort of side-by-side GUI view of the diffs?
Comparing *nix and Windows files need not be a big issue. Just massage the line ends to be compatible before you diff, or use a diff that is insensitive to such things.
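A minimal sketch of that line-ending massage, with Text::Diff (a CPAN module) doing the actual diff if it is installed; the file names are hypothetical:

```perl
use strict;
use warnings;

# Convert DOS line endings to Unix so CRLF vs LF alone does not
# register as a difference.
sub normalize_eol {
    my ($text) = @_;
    $text =~ s/\r\n/\n/g;
    return $text;
}

# If Text::Diff (CPAN) is available, slurp and normalize both files,
# then render a traditional unified diff.
if (eval { require Text::Diff; 1 }) {
    my @texts;
    for my $file ('prod/app.conf', 'dev/app.conf') {
        open my $fh, '<', $file or last;
        local $/;    # slurp mode
        push @texts, normalize_eol(scalar <$fh>);
    }
    print Text::Diff::diff(\$texts[0], \$texts[1], { STYLE => 'Unified' })
        if @texts == 2;
}
```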
Note that there are many GUI diff tools available already - why do you need to reinvent this particular wheel?
Perl is Huffman encoded by design.
Re: Generic compare script.
by adrianh (Chancellor) on Oct 12, 2005 at 08:46 UTC
I have been tasked with writing a compare script to highlight any inconsistencies between files in production and development environments.
I'd just use a source control system like Subversion, SVK or even, if I was forced, CVS. Check in both directories and do a diff.
Re: Generic compare script.
by Perl Mouse (Chaplain) on Oct 12, 2005 at 09:15 UTC
I'd start with running 'rsync' with the '--dry-run' option. This will give me a list of files that differ. Then, if I wanted to know the differences between specific files, I'd run 'diff'.
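Driving that from Perl might look like the following sketch (the paths are hypothetical; the flags are standard rsync options: -n is --dry-run, -r recurses, -c compares by checksum rather than size/mtime, and --out-format=%n prints one changed file name per line):

```perl
use strict;
use warnings;

# Build an rsync dry-run command that lists, without copying anything,
# every file that differs from or is missing on the destination.
sub rsync_dryrun_cmd {
    my ($src, $dest) = @_;
    return ('rsync', '-rcn', '--out-format=%n', $src, $dest);
}

# Hypothetical usage:
#   perl rsync_check.pl /prod/dir/ host.B:/prod/dir/
if (@ARGV == 2) {
    my @cmd = rsync_dryrun_cmd(@ARGV);
    open my $rsync, '-|', @cmd or die "rsync: $!";
    while (my $name = <$rsync>) {
        chomp $name;
        print "differs or missing on destination: $name\n";
    }
    close $rsync;
}
```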
Re: Generic compare script.
by pajout (Curate) on Oct 12, 2005 at 08:01 UTC
My experience with Windows tools is about 5 years old, but if you can use a *nix-like diff in that OS, or Emacs's ediff for visually resolving differences, you may be satisfied.
Pajout, you are slightly off the point so early in the morning...
So, I am sorry for my previous message.