in reply to Re: List Duplicate Files in a given directory
in thread List Duplicate Files in a given directory

Thanks Huck and Keybot. From your solutions it is clear that I should use the md5sums as the keys of the hash, with each value being a reference to an array of file names. I have changed my original code to build a hash of arrays this way.

#!/usr/bin/perl
use warnings;
use strict;

##############
my $dir = $ARGV[0];
my %md5sum;

opendir( my $dh, $dir ) or die "Unable to open the directory: $!\n";
chdir $dir or die "Cannot change directory: $!\n";

while ( my $file = readdir $dh ) {
    next if $file =~ /^\.{1,2}$/;
    if ( -f $file ) {
        # quote the filename so names with spaces survive the shell
        my ($md) = ( split /\s+/, qx(/usr/bin/md5sum "$file") )[0];
        # push autovivifies the array reference, so no exists() check is needed
        push @{ $md5sum{$md} }, $file;
    }
}
closedir($dh);

foreach my $ky ( keys %md5sum ) {
    if ( @{ $md5sum{$ky} } == 1 ) {
        print "Unique File: @{$md5sum{$ky}}, Md5sum: $ky\n";
    }
    else {
        print "Duplicate Files: @{$md5sum{$ky}}, Md5sum: $ky\n";
    }
}
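As a side note, shelling out to /usr/bin/md5sum once per file is slow and fragile (filenames with shell metacharacters can still break the command line). The core Digest::MD5 module can compute the digest in-process instead. A minimal sketch, where the md5_of_file helper name is my own, not from the thread:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Digest::MD5;

# Hypothetical helper: return the hex md5 digest of a file,
# read in binary mode, without spawning an external process.
sub md5_of_file {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!\n";
    my $ctx = Digest::MD5->new;
    $ctx->addfile($fh);    # streams the file through the digest
    close $fh;
    return $ctx->hexdigest;
}
```

The qx() line in the script above could then become my $md = md5_of_file($file); with no change to the rest of the logic.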
-bash-3.2$ ./duplicate_files.pl directory
Duplicate Files: file4 file2 file3, Md5sum: d41d8cd98f00b204e9800998ecf8427e
Unique File: file6, Md5sum: d617c2deabd27ff86ca9825b2e7578d4
Duplicate Files: file1 file5, Md5sum: 5bb062356cddb5d2c0ef41eb2660cb06