PerlScholar has asked for the wisdom of the Perl Monks concerning the following question:

Apologies for the poor formatting on my previous post and thank you all for your input. So I am creating arrays to store the checksum of files and I do this twice as I have 2 directories. I am then trying to compare the arrays using the last bit of code. This is when the arrays go out of scope:

for my $entry(@files1) {
my $readfile = "$dir1\\$entry";
#print "$readfile\n";

if (-e $readfile && -r $readfile){
open (FILE, $readfile) or die "Error opening: '$readfile': $!";
binmode(FILE);

my $md5 = Digest::MD5->new;
$md5-> b64digest;

while (<FILE>) {
$md5->add($_);
}

my @digest1 = $md5->b64digest;
#print "@digest1\n";#This array will go out of scope
print "$readfile: @digest1\n";
}
close (FILE);
}        print "\n";

#2nd directory
for my $entry(@files2) {
my $readfile = "$dir2\\$entry";
#print "$readfile\n";

if (-e $readfile && -r $readfile){
open (FILE, $readfile) or die "Error opening: '$readfile': $!";
binmode(FILE);

my $md5 = Digest::MD5->new;
$md5-> b64digest;

while (<FILE>) {
$md5->add($_);
}
my @digest2 = $md5->b64digest;#This array will go out of scope
print "$readfile: @digest2\n";
}
close (FILE);
    }

#compare directories
my $equals = 1;
foreach (my $i = 0; $i <@digest1; $i++) {
    if (my $digest1[$i] ne my $digest2[$i]){
$equals = 0;
}
else {
$equals = 1;
     }
print "$equals";
}

Error messages

Global symbol "@digest1" requires explicit package name at Z:/My Documents/Workspace/DeployChecker/Test.pl line 101.
syntax error at Z:/My Documents/Workspace/DeployChecker/Test.pl line 102, near "$digest1["
syntax error at Z:/My Documents/Workspace/DeployChecker/Test.pl line 105, near "else"
Execution of Z:/My Documents/Workspace/DeployChecker/Test.pl aborted due to compilation errors.

Replies are listed 'Best First'.
Re: Global variables question
by jwkrahn (Abbot) on Aug 24, 2010 at 02:06 UTC
    my @files1 = grep {/\w/} readdir(DIR1);
    my @files2 = grep {/\w/} readdir(DIR2);

    When you use readdir to get the file names, you get only the bare file names. To access the actual files you have to prepend the directory name to each file name; otherwise you are trying to access files in the current directory.
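A minimal sketch of that point, assuming a made-up helper name (plain_files_in) and using File::Spec to join the directory and the file name. It is one way to do it, not the only one:

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: full paths of the plain files found in $dir.
sub plain_files_in {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open directory '$dir': $!";
    # readdir returns bare names, so join each one with the directory
    # before testing it with -f.
    my @paths = grep { -f $_ } map { File::Spec->catfile($dir, $_) } readdir $dh;
    closedir $dh;
    @paths = sort @paths;    # readdir order is not guaranteed
    return @paths;
}

print "$_\n" for plain_files_in('.');
```

Without the catfile step, -f would test names relative to the current directory, which is the mistake described above.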

Re: Global variables question
by dasgar (Priest) on Aug 23, 2010 at 23:26 UTC

    First, please put your code inside <c></c> tags to make it easier for others to read. (For an example, see the code in this post.)

    Second, can you specify which array(s) you believe to go out of scope and share what kind of error/warning messages you are getting? Without this information, it makes it a bit challenging to figure out which of the several arrays you're having problems with. Since you're using both strict and warnings, you should be getting some useful error/warning messages that will help in debugging the code.

    Also, I believe that the last little bit of code at the end of your post has some incorrect syntax. It probably should be rewritten to be more like the untested code below.

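    use warnings;
    use strict;

    # @array1 and @array2 are assumed to be declared and filled earlier.
    my $equals = 1;
    for (my $i = 0; $i <= $#array1; $i++) {    # <= so the last element is checked too
        if ($array1[$i] ne $array2[$i]) { $equals = 0; }
        else                            { $equals = 1; }
        print $equals;
    }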

    Of course someone more familiar with the ternary operator than I am will probably point out that the if statement can be shortened to a single line. I think it would be something like:

    $equals = (($array1[$i] ne $array2[$i]) ? 0 : 1);
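Note that a loop which resets $equals on every pass only remembers the result for the last pair. If the goal is a single yes/no answer for the whole arrays, a sketch along these lines may be closer to it. The helper name arrays_equal and the sample values are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: true only if both arrays hold the same strings
# in the same order. Takes two array references.
sub arrays_equal {
    my ($left, $right) = @_;
    return 0 unless @$left == @$right;    # different lengths never match
    for my $i (0 .. $#$left) {
        return 0 if $left->[$i] ne $right->[$i];    # stop at first mismatch
    }
    return 1;
}

my @array1 = ('aaa', 'bbb');
my @array2 = ('aaa', 'ccc');
my $verdict = arrays_equal(\@array1, \@array2) ? "same" : "different";
print "$verdict\n";
```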
Re: Global variables question
by JavaFan (Canon) on Aug 24, 2010 at 10:20 UTC
    If you indent your code properly, it should be easy to see in which block @digest1 is declared, and in which blocks it's used. The mismatch should be clear. But it's very hard to understand your code as it's presented now.

    Furthermore, my $digest1[$i] is a syntax error. You cannot declare an array element as a lexical variable.
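A minimal sketch of the scoping point, assuming a made-up helper name (digests_for). The array is declared once, outside the loop, and each digest is pushed onto it inside the loop, so it is still visible afterwards:

```perl
use strict;
use warnings;
use Digest::MD5;

# Hypothetical helper: one base64 MD5 digest per readable file, in input order.
sub digests_for {
    my @paths = @_;
    my @digests;    # declared OUTSIDE the loop, so it survives the loop
    for my $path (@paths) {
        next unless -f $path && -r $path;
        open(my $fh, '<', $path) or die "Error opening '$path': $!";
        binmode($fh);
        my $md5 = Digest::MD5->new;    # a fresh object for every file
        $md5->addfile($fh);
        push @digests, $md5->b64digest;
        close($fh);
    }
    return @digests;
}

my @digest1 = digests_for(@ARGV);
print "$_\n" for @digest1;
```

Elements are then read as $digest1[$i], with no "my" in front. "my" belongs only on the declaration of the whole array.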

      Hi, thanks for your reply. I can see the mismatch, but the problem is that when I change @digest1 to a global variable I still have the same problem. I tried to resolve this by removing @digest1 from inside the loop, but that also gave an error:

      Can't call method "b64digest" on an undefined value at Z:/My Documents/Workspace/DeployChecker/Test.pl line 50.

      Any ideas? Many thanks!

Re: Global variables question
by Marshall (Canon) on Aug 24, 2010 at 12:13 UTC
    Your code is not indented properly and that makes it hard to follow. I personally prefer the newer style of putting the left braces on the next line instead of at an end of line. Your preference may vary. But no matter which approach you use, the indentation levels should be easy to see and this does make a difference!

    I think you are relatively new to this, so I will explain my thinking while writing some code for you...

    This is the first time that I've used the MD5 checksum thing. I downloaded and installed the MD5 module and then considered which format is easiest to use. I figured that hex was. I did a little hacking to see that this produced what I thought it would. Not to say that base64 is "bad"; use what you want. All that matters here is that you can generate a repeatable string for a file and that it is very unlikely for 2 different files to have the same string.
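A small sketch of the two formats side by side, using the functional interface of Digest::MD5. The input string is just a stand-in for file contents:

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex md5_base64);

my $data = "hello world";        # stand-in for the contents of a file
my $hex  = md5_hex($data);       # 32 hex characters
my $b64  = md5_base64($data);    # 22 base64 characters, no padding
print "hex:    $hex\n";
print "base64: $b64\n";
```

Both strings encode the same 16-byte digest, so either one works for comparison as long as both directories use the same format.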

    The next step was to figure out the right data structure. Basically we need the file name (which is unique in a directory) and the fancy checksum (MD5) for that file, which will be a single string consisting of hex values. A hash table instead of an array seemed appropriate.

    Then I made a subroutine that takes pathname and returns the hash of name=>checksum. There are some ways of passing back the return values more efficiently, but here it didn't seem to matter.

    I'm not sure what the comparison rules are. If, say, dirA is considered the "master dir" and you want to know if any files changed in dirB, loop over the names in dirA and see if the corresponding checksum in dirB matches. Things get more complex if you want to know if any "additional" files are in B that are not in A. Rather than write code, I leave it to you to decide what you need.

    #!/usr/bin/perl -w
    use strict;
    use Digest::MD5 qw(md5 md5_hex md5_base64);
    use Data::Dumper;
    $| =1;  #turns off output buffering, useful for debugging

    #all of these are possible
    # $digest = md5($data);
    # $digest = md5_hex($data);    #make it easy , use this!
    # $digest = md5_base64($data);

    my %dirA = get_chksums(".");  #### put real dir name here,
                                  #### not "."(current directory)
    print Dumper \%dirA;

    my %dirB = get_chksums(".");  #### put real dir name here
    print Dumper \%dirB;

    #####
    ### put some comparison stuff here
    #####

    sub get_chksums
    {
       my $path = shift;
       my %file2cksum;
       opendir (INDIR, $path) || die "unable to open $path";
       my @files = grep {-f "$path/$_"} readdir INDIR;
       closedir INDIR;   #directory handles are closed with closedir
       foreach my $file (@files)
       {
          open (IN, '<', "$path/$file") || die "unable to open $path/$file";
          binmode(IN);   #read raw bytes, matters on Windows
          $file2cksum{$file} = md5_hex(<IN>);
        #  print "$file $file2cksum{$file}\n";  #for debugging...
          close IN;
       }
       return %file2cksum;
    }
      Hi Marshall,

      Thank you for your detailed response, very well explained. As you can tell I'm quite new to this, so I will work on my indenting to make my code clearer in the future. Just for my understanding... is there a difference between a hash and a 2D array, or is a hash table more efficient? Also, for the compare part I want to compare the files as you described in the last part of your answer. Would something like this be on the right lines?

      my $found = 0;
      foreach my $key1 (keys %hash1) {
      foreach my $key2 (keys %hash2) {
      if ($hash1{$key1} eq $hash2{$key2})

      {
      $found=1;
      }
      print "$found";
      }
      }

        I figure you still have some indenting work to do. This is very important. Judiciously applied white space is one of the very most important things that you can do to improve readability of your code.

        Untested, but I figure close to what you want...test, experiment, move forward with the advice you've gotten so far...

        my $num_errors = 0;
        foreach my $file (keys %hash1)
        {
           if (!exists ($hash2{$file}) )
           {
              print "file: $file doesn't exist in 2nd directory\n";
           }
           elsif ($hash1{$file} ne $hash2{$file})
           {
              print "md5 didn't match for $file\n";
              # meaning that file in 2nd directory is not the
              # same as the file in 1st directory
              $num_errors++;
           }
        }
        print "total errors = $num_errors\n";

        Perl "sees" something akin to this (below)... a bit harder to understand than the above?
        white space is important, variable names are important.
        I called my hashes %dirA and %dirB instead of %hash1 and %hash2 for a reason!

        %x is a hash but "x" has no contextual meaning! %dirA is a hash of file names in directory A to checksums.
        Even %dirA_files_to_checksums would be wayyyyyy better than %hash1. I guess %dir1 is also ok. The % means hash - give more contextual information!

        my $num_errors=0;foreach my $file (keys %hash1){if (!exists ($hash2{$file})){print "file: $file doesn't exist in 2nd directory\n";}elsif ($hash1{$file} ne $hash2{$file}){print "md5 didn't match for $file\n";$num_errors++;}}print "total errors = $num_errors\n";
        PS: Yes, a hash table for this purpose is going to be WAY more efficient than an array.
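For the more complex case mentioned earlier in this sub-thread (files that exist in only one of the two directories), a two-way comparison could be sketched as follows. The helper name compare_dirs, the report keys, and the checksum values are all made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: takes two hash refs of file name => checksum and
# reports names missing on either side plus names whose checksums differ.
sub compare_dirs {
    my ($dirA, $dirB) = @_;
    my %report = (only_in_A => [], only_in_B => [], changed => []);
    for my $file (sort keys %$dirA) {
        if (!exists $dirB->{$file}) {
            push @{ $report{only_in_A} }, $file;
        }
        elsif ($dirA->{$file} ne $dirB->{$file}) {
            push @{ $report{changed} }, $file;
        }
    }
    for my $file (sort keys %$dirB) {
        push @{ $report{only_in_B} }, $file unless exists $dirA->{$file};
    }
    return %report;
}

# Placeholder checksums, not real MD5 values.
my %dirA = ('a.txt' => '111', 'b.txt' => '222', 'c.txt' => '333');
my %dirB = ('a.txt' => '111', 'b.txt' => '999', 'd.txt' => '444');
my %report = compare_dirs(\%dirA, \%dirB);
print "$_: @{ $report{$_} }\n" for sort keys %report;
```

Each name is looked up directly by key, so there is no need for a nested loop over both hashes.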
Re: Global variables question
by Marshall (Canon) on Aug 24, 2010 at 03:59 UTC
    See post by jwkrahn, in addition, I think your grep may not be doing what you think.
    opendir( DIR1, $dir1 ) or die("Error opening: $dir1");
    my @files1 = sort grep {-f "$dir1/$_"} readdir(DIR1);
    may work better for you. This will filter out any directories. Also, I don't think that the files returned by readdir are guaranteed to be in any order, so if you sort, that will get things into a predictable, repeatable order.