Xhings has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks,

I am reading multiple files and want to put the contents of each file into its own hash, in order to compare the hashes for uniqueness. The idea is to find the unique files. Following is my code:

my $flag;
my $x = 1;
print "Tell me dude/dudette where are the log files: \n";
$Location1 = <STDIN>;
chomp $Location1;
print "And where would you like to put the processed results: \n";
$Location2 = <STDIN>;
chomp $Location2;
$UserInputDir = $Location1;
opendir DH, $UserInputDir or die "Cannot open $UserInputDir: $!";
@files = readdir DH;
closedir DH;
foreach $file (@files) {
    next if $file =~ /^\./;
    print "creating hash for $file....\n";
    %filehash.$x = &filetohash($file);
    $x++;
}

sub filetohash {
    my $counter;
    my $file = shift;
    my %filehash;
    open(FILE, "< $file") || die $!;
    while (<FILE>) {
        chomp $_;
        s/^ *//;
        s/ *$//;
        $filehash{$_} = 1;
        $counter++;
    }
    print "counted $counter lines for file $dir\n";
    return %filehash;
}

I am not able to get a separate hash for each file read. The error is at the point in the foreach loop where I try to build a uniquely named hash:

%filehash.$x = &filetohash($file);

I understand the syntax is wrong here. Can this be fixed, or is there an alternative approach?

Thanks, Xhings

Thanks a lot guys for your valuable comments. Now that I have read the files (to be compared for uniqueness) into hashes and am pushing the hashes into an array, I am trying to compare the hashes for uniqueness. To do that I have done the following:

use Data::Compare;
use Data::Dumper;

my $i = 0;
my $y = 0;
my $flag;
my $x = 1;
my @aoh;
my @dupAoh;

print "Tell me dude/dudette where are the log files: \n";
$Location1 = <STDIN>;
chomp $Location1;
#print "And where would you like to put the processed results: \n";
#$Location2 = <STDIN>;
#chomp $Location2;
$UserInputDir = $Location1;
opendir DH, $UserInputDir or die "Cannot open $UserInputDir: $!";
@files = readdir DH;
closedir DH;

foreach $file (@files) {
    next if $file =~ /^\./;
    print "creating hash for $file....\n";
    %filehash1 = filetohash($Location1.'\\'.$file);
    push @aoh, \%filehash1;
    #print Dumper(\\%filehash1);
    $x++;
}
print scalar(@aoh)."\n";

@dupAoh = @aoh;
#print Dumper(\@dupAoh);
foreach $val (@aoh) {
    #print "$val and $val++";
    %hashValue1 = pop(@dupAoh);
    for ($i = 0; $i < scalar(@dupAoh); ++$i) {
        %hashValue2 = @aoh[i];
        #print Dumper(\\%hashValue1);
        #print Dumper(\\%hashValue2);
        Compare(\%hashValue1, \%hashValue2) ? "identical \n" : "not identical.\n";
        #compHash(\%hashValue1, \%hashValue2);
    }
}

sub compHash {
    my %filehash1 = shift;
    my %filehash2 = shift;
    for $key (keys %filehash1) {
        if ($filehash2{$key} == 1) { $flag = 1; } else { $flag = 0; last; }
        print "Flag value is $flag \n";
    }
    ($flag == 1) ? print "Yooohoo exactly same \n" : print "Not Same Alas!!!\n";
}

sub filetohash {
    my $counter;
    my $file = shift;
    my %filehash;
    open(FILE, "< $file") || die $!;
    while (<FILE>) {
        chomp $_;
        s/^ *//;
        s/ *$//;
        $filehash{$_} = 1;
        $counter++;
    }
    print "counted $counter lines for file $dir\n";
    return \%filehash;
}

I am trying with one of my own functions and have also tried the Data::Compare module, which I haven't had much time to play with yet. I have copied the array of hashes into another array. From the duplicate array I pop the hashes (so the array shrinks with each pass) and compare each against the remaining elements of the original array. Basically, given 10 hashes, I want to compare the 1st hash with the subsequent 9, then in the next pass the 2nd hash with the subsequent 8, and so on. Please let me know what I am doing wrong, or whether my logic is skewed.
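The pairwise pass described above can be written with two index loops over the same array, which avoids the popped-copy bookkeeping and the `@aoh[i]` indexing mistake. This is only a minimal sketch: `same_keys` is a hypothetical stand-in for Data::Compare's Compare(), comparing two line-set hashes by their keys, and the sample data is made up.

```perl
use strict;
use warnings;

# Hypothetical helper standing in for Data::Compare::Compare:
# two "set of lines" hashes are identical iff they have the same keys.
sub same_keys {
    my ($h1, $h2) = @_;
    return 0 unless keys %$h1 == keys %$h2;
    for my $k (keys %$h1) {
        return 0 unless exists $h2->{$k};
    }
    return 1;
}

# Made-up sample data; in the real script @aoh holds the references
# returned by filetohash().
my @aoh = ( { a => 1, b => 1 }, { b => 1 }, { a => 1, b => 1 } );

# Compare element 0 with 1..N, then 1 with 2..N, and so on.
for my $i (0 .. $#aoh - 1) {
    for my $j ($i + 1 .. $#aoh) {
        print "hash $i vs hash $j: ",
              same_keys($aoh[$i], $aoh[$j]) ? "identical" : "not identical",
              "\n";
    }
}
```

With the sample data this reports 0 vs 2 as identical and the other two pairs as not identical.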

Thanks, Xhings

Re: Unique Hash name for each pass
by cdarke (Prior) on May 13, 2010 at 12:38 UTC
    How about an array of hashes?
    my @aoh;
    foreach $file (@files) {
        next if $file =~ /^\./;
        print "creating hash for $file....\n";
        my %filehash = filetohash($file);
        push @aoh, \%filehash;
        $x++;
    }
    It would be cleaner if you returned a reference from the filetohash subroutine instead:
    sub filetohash {
        ...
        return \%filehash;
    }
    ...
    push @aoh, filetohash($file);
    ...


Re: Unique Hash name for each pass
by CountZero (Bishop) on May 13, 2010 at 14:00 UTC
    And unless you have a very good reason to do otherwise, do not call a subroutine as
    my $result = &name_of_sub(...);

    Drop the & and just write my $result = name_of_sub(...).
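One concrete reason for this advice: an &-call with no parentheses silently reuses the caller's @_, which is almost never what you want. A small illustration (the subroutine names are made up for the example):

```perl
use strict;
use warnings;

sub show_args { return scalar @_ }    # just counts its arguments

sub caller_sub {
    # Normal call: sees exactly the two arguments passed.
    my $plain = show_args(1, 2);      # 2
    # &-call with no parens: reuses caller_sub's own @_ (three elements).
    my $amp = &show_args;             # 3
    return ($plain, $amp);
}

my ($plain, $amp) = caller_sub('x', 'y', 'z');
print "$plain $amp\n";                # prints "2 3"
```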

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

Re: Unique Hash name for each pass
by wfsp (Abbot) on May 13, 2010 at 13:45 UTC
    Just out of curiosity, are you trying to do something like this?
    #!/usr/bin/perl
    use warnings;
    use strict;

    my @files = get_files();
    my %single_hash;
    for my $file (@files){
        open my $fh, q{<}, $file
            or die qq{open to read $file failed: $!\n};
        my $file_as_string = do { local $/; <$fh> };
        push @{$single_hash{$file_as_string}}, $file;
    }
    for my $value (values %single_hash){
        print qq{@{$value}\n};
    }
    sub get_files{
        # stuff
        return qw{one two three};
    }
    files one and two:
        one two three four
    file three:
        one two three four five
    output:
        three
        one two
    Two files (one and two) are the same, i.e. not unique. Am I warm?
Re: Unique Hash name for each pass
by biohisham (Priest) on May 15, 2010 at 16:52 UTC
    Since the aim is a file-contents comparison, have you considered a module like File::Compare? It has several useful functions and clear documentation as well.
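    For example, File::Compare's compare() returns 0 when two files have identical contents, 1 when they differ, and -1 on error. A minimal sketch (the file names are made up; the files are created just for the demonstration):

```perl
use strict;
use warnings;
use File::Compare;

# Create two small files with the same contents (names are illustrative).
open my $fh, '>', 'a.log' or die $!;
print $fh "one\ntwo\n";
close $fh;
open $fh, '>', 'b.log' or die $!;
print $fh "one\ntwo\n";
close $fh;

# compare() returns 0 if equal, 1 if different, -1 on error.
if (compare('a.log', 'b.log') == 0) {
    print "identical\n";
} else {
    print "not identical\n";
}

unlink 'a.log', 'b.log';    # clean up the demo files
```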


    Excellence is an Endeavor of Persistence. Chance Favors a Prepared Mind.