If you want to determine whether files on host.A are identical to files on host.B, write a single script to run separately on each host: use File::Find to locate all the data files, use Digest::MD5 to compute an MD5 signature (checksum) for each file, and output a list of file names and checksums.
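Here is a minimal sketch of that per-host script. The sub name checksum_tree and the output format (checksum, a tab, then the path relative to the base directory) are my own assumptions, not anything required by the modules; relative paths are used so that lists from hosts with different base directories still line up.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use Digest::MD5;

# Return a hash ref mapping each plain file's path (relative to $base)
# to the MD5 hex digest of its contents.
sub checksum_tree {
    my ($base) = @_;
    my %sum;
    find({
        no_chdir => 1,    # keep $_ equal to $File::Find::name
        wanted   => sub {
            return unless -f $_;    # plain files only
            my $path = $File::Find::name;
            open my $fh, '<', $path
                or do { warn "skipping $path: $!\n"; return };
            binmode $fh;            # matters on Windows
            (my $rel = $path) =~ s{^\Q$base\E/*}{};
            $sum{$rel} = Digest::MD5->new->addfile($fh)->hexdigest;
            close $fh;
        },
    }, $base);
    return \%sum;
}

# For each directory on the command line, print "checksum<TAB>path"
# lines sorted by path.
for my $base (@ARGV) {
    my $sum = checksum_tree($base);
    print "$sum->{$_}\t$_\n" for sort keys %$sum;
}
```

You would run it on each host as something like "perl mklist.pl /base/dir/path > host.A.checksum.list" (the script name is just a placeholder).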

Then write a separate script (much simpler) to compare the lists of file names and checksums from the two hosts, to report (1) files on A not on B, (2) files on B not on A, and (3) files with same name on each host but different content.
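A rough sketch of such a compare script follows. It assumes each list line is a checksum, a tab, and a file path (that format is my assumption; adjust the split if your lists look different), and the sub names read_list and compare_lists are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read "checksum<TAB>path" lines into a hash ref: path => checksum.
sub read_list {
    my ($file) = @_;
    my %sum;
    open my $fh, '<', $file or die "Cannot open $file: $!\n";
    while (my $line = <$fh>) {
        $line =~ s/\r?\n\z//;    # tolerate Unix and Windows line endings
        my ($md5, $path) = split /\t/, $line, 2;
        next unless defined $path;    # skip blank or malformed lines
        $sum{$path} = $md5;
    }
    close $fh;
    return \%sum;
}

# Return three array refs of sorted paths:
# only in the first list, only in the second, in both but with different checksums.
sub compare_lists {
    my ($left, $right) = @_;
    my @only_left  = grep { !exists $right->{$_} } sort keys %$left;
    my @only_right = grep { !exists $left->{$_} }  sort keys %$right;
    my @differ     = grep { exists $right->{$_} && $left->{$_} ne $right->{$_} }
                     sort keys %$left;
    return (\@only_left, \@only_right, \@differ);
}

# Usage: perl cmplist.pl host.A.checksum.list host.B.checksum.list
if (@ARGV == 2) {
    my ($only_a, $only_b, $differ) =
        compare_lists(read_list($ARGV[0]), read_list($ARGV[1]));
    print "on A, not on B: $_\n"    for @$only_a;
    print "on B, not on A: $_\n"    for @$only_b;
    print "different content: $_\n" for @$differ;
}
```

Keying both hashes on the path is what makes the three reports cheap: each one is a single pass over the keys of one list.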

Bear in mind that the only reason you would need to write a Perl script to do this is so you could do it easily on the Windows machines. The standard tools on any Unix box can already do it all with a couple of simple command lines:

    # on host.A:
    find /base/dir/path -type f -print0 | xargs -0 md5 > host.A.checksum.list
    # and likewise on host.B, then put both list files in one place and try:
    diff host.A.checksum.list host.B.checksum.list

Actually, these tools (as well as a couple different Bourne-like shells, bash and ksh) have been ported to Windows (look for Cygwin, AT&T Research Labs "UWIN", maybe others), so you could do shell commands like the ones above on all your machines.

In case diff makes the list comparison a bit too opaque for you, I wrote a handy list-compare utility that might help with the last step, and posted it here: cmpcol.

(update: It may be that I have misread your post. For the case of the same file name existing on two hosts, the plan I suggested will only report whether they are identical or not. If you actually want to describe the nature of differences, this plan will at least tell you which files need to be inspected in closer detail, which will save you a lot of time and trouble. Ideally, you would have just a few pairs of files that need to be fetched into a common location, and you can use "diff" on them, or whatever.)


In reply to Re: Generic compare script. by graff
in thread Generic compare script. by TeraMarv
