How to divide the value of a hash key by the value of another hash key (when the keys are equivalent)?

aquinom has asked for the wisdom of the Perl Monks concerning the following question:

Hey Monks, I'm trying to read in a file and build a hash of key/value pairs (summed, as in the code), then read a second file that uses the same hash keys and divide the first file's sums by the second file's sums. I'm not sure whether to use two hashes or stick to one, but let's assume my first input file looks like

0201    3201    1.00
0201    2608    1.00
0201    2402    0.94
0201    0302    1.00
2402    2402    0.99
0101    0201    0.99
0201    1101    1.00
0301    2601    1.00
0301    1101    0.98
2601    0301    1.00
0301    2601    1.00
0301    2601    1.00
0301    2601    1.00

and my second input file looks like

0201    3201    2.00
0201    2608    2.00
0201    2402    1.94
0201    0302    2.00
2402    2402    1.99
0101    0201    1.99
0201    1101    2.00
0301    2601    2.00
0301    1101    1.98
2601    0301    2.00
0301    2601    2.00
0301    2601    2.00
0301    2601    2.00

In the output I would expect the value for the row/column corresponding to A0301/A2601 to equal 0.5 (5/10), but my code never divides by the values from the second input file (I just get 5). Not sure what to do; can anyone help me fix this?

#!/usr/bin/perl
use strict;
use warnings;

my $infile = $ARGV[0];
my $infile2 = $ARGV[1];

unless (open(INFILE, $infile)){
       die "Couldn't open infile: $!\n";
}


my @AtypeData = qw(A0101 A0102 A0201 A0202 A0205 A0301 A0302 A1101 A2301 A2402 A2403 A2601 A2608 A2902 A3001 A3002 A3004 A3101 A3201 A3601 A6801 A6802);


my %diplotypes;
my %diplotypes2;

initHash(\%diplotypes, \@AtypeData);
initHash(\%diplotypes2, \@AtypeData);

##read in the data
while (<INFILE>){
    chomp;
    my @line = split ('\t', $_);
    my $key1 = 'A' . $line[0] . '.' . 'A' . $line[1]; ##first key
    my $key2 = 'A' . $line[1] . '.' . 'A' . $line[0]; ##key the other way
    ##check to see if the key exists in the hash
    ##if it doesn't there is data in your infile, not in your names array
    if (exists $diplotypes{$key1} && $line[0] <= $line[1]) {
        $diplotypes{$key1} += $line[2];
        }
    elsif (exists $diplotypes{$key2} && $line[0] >= $line[1]) {
        $diplotypes{$key2} += $line[2];
        }
    else{##world is out to get you
        print STDERR "No key for $key1 or $key2\n";
        next;
    }

}

close INFILE;

unless (open(INFILE2, $infile2)){
       die "Couldn't open infile: $!\n";
}
while (<INFILE2>){
    chomp;
    my @line = split ('\t', $_);
    my $key1 = 'A' . $line[0] . '.' . 'A' . $line[1]; ##first key
    my $key2 = 'A' . $line[1] . '.' . 'A' . $line[0]; ##key the other way
    ##check to see if the key exists in the hash
    ##if it doesn't there is data in your infile, not in your names array
    if (exists $diplotypes2{$key1} && $line[0] <= $line[1]) {
        $diplotypes2{$key1} += $line[2];
        }
    elsif (exists $diplotypes2{$key2} && $line[0] >= $line[1]) {
        $diplotypes2{$key2} += $line[2];
        }
    else{##world is out to get you
        print STDERR "No key for $key1 or $key2\n";
        next;
    }

}
foreach my $key1(keys %diplotypes){
        if (exists $diplotypes2{$key1}){
            $diplotypes{$key1} /= $diplotypes2{$key1} +0.01;
            }
        }




close INFILE2;

printData(\%diplotypes, \@AtypeData);

sub initHash {  #init the all to all hash
    ##first argument is the hash of data, and the second is a reference to all the columns
    my ($refHash, $refArr) = @_;
    foreach my $ele1(@$refArr){
        foreach my $ele2(@$refArr){
            my $key = $ele1 . "." . $ele2;

            if (exists $$refHash{$key}){
                print STDERR "This key existed in your array of names, skipping\n";
                next;
            }
            else{
                $$refHash{$key} = 0;
            }
        }
    }
}
sub printData {
    
    my ($refHash, $refArr) = @_;
    #print header line;
    print "MATRIX\t";
    foreach my $ele(@$refArr){
        print "$ele", "\t";
    }
    print "\n";
    #print out the actual data    
    foreach my $ele1(@$refArr){
        print "$ele1" , "\t";##print out the first value on the row, which is the name
        foreach my $ele2(@$refArr){
            my $key = $ele1 . "." . $ele2;
            if (exists $$refHash{$key}){
                printf "%.2f \t", $$refHash{$key};
            }
            else{
                print STDERR "Something is wrong\n";
            }
        }
        print "\n";
    }

    
}


Re: How to divide the value of a hash key by the value of another hash key (when the keys are equivalent)? by zek152 (Pilgrim) on Jun 06, 2011 at 18:05 UTC
One immediate problem is that you are only reading one file:

#your code
#my $infile = $ARGV[0];
#my $infile2 = $ARGV[0]; <--- $infile == $infile2
#corrected code
my $infile = $ARGV[0];
my $infile2 = $ARGV[1];

Update: found another issue. In the following code block you never initialize the value to 0 if the key does not exist.

#yourcode
while (<INFILE>){
    chomp;
    my @line = split ('\t', $_);
    my $key1 = 'A' . $line[0] . '.' . 'A' . $line[1]; ##first key
    my $key2 = 'A' . $line[1] . '.' . 'A' . $line[0]; ##key the other way
    ##check to see if the key exists in the hash
    ##if it doesn't there is data in your infile, not in your names array
    if (exists $diplotypes{$key1} && $line[0] <= $line[1]) {
        $diplotypes{$key1} += $line[2];
    }
    elsif (exists $diplotypes{$key2} && $line[0] >= $line[1]) {
        $diplotypes{$key2} += $line[2];
    }
    else{##world is out to get you
        print STDERR "No key for $key1 or $key2\n";
        next;
    }
}

Something along the lines of the following might help your issue.

while (<INFILE>){
    chomp;
    my @line = split ('\t', $_);
    my $key1 = 'A' . $line[0] . '.' . 'A' . $line[1]; ##first key
    my $key2 = 'A' . $line[1] . '.' . 'A' . $line[0]; ##key the other way
    ##new logic
    if($line[0] <= $line[1]) {
        if(exists $diplotypes{$key1}) {
            $diplotypes{$key1} += $line[2];
        }
        else {
            #key doesn't exist so add it
            $diplotypes{$key1} = $line[2];
        }
    }
    else {
        if(exists $diplotypes{$key2}) {
            $diplotypes{$key2} += $line[2];
        }
        else {
            #key doesn't exist so add it
            $diplotypes{$key2} = $line[2];
        }
    }
}

2nd Update: I did not notice that you had an initHash function. That makes my 1st update unnecessary; however, what I posted is a more compact way of achieving the same result. Sorry for the confusion. Hope this helps.
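For reference, the two near-identical read loops in the original post can be folded into one subroutine, with the division done in a second one. This is only a minimal sketch under stated assumptions: sum_pairs, divide_sums and demo_text are hypothetical names that do not appear in the thread, the input is a five-row subset of the sample data held in in-memory filehandles, and it skips both the pre-initialised key list and the +0.01 in the denominator that the posted code uses.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): sum column 3 per allele pair,
# storing each pair under one canonical key with the smaller code first.
sub sum_pairs {
    my ($fh) = @_;
    my %sum;
    while (my $line = <$fh>) {
        chomp $line;
        next unless $line =~ /\S/;              # skip blank lines
        my ($x, $y, $value) = split /\t/, $line;
        ($x, $y) = ($y, $x) if $x > $y;         # numeric compare, as in the post
        $sum{"A$x.A$y"} += $value;
    }
    return \%sum;
}

# Hypothetical helper: divide each sum from the first hash by the matching
# sum from the second, skipping keys that are missing or zero there.
sub divide_sums {
    my ($num, $den) = @_;
    my %ratio;
    for my $key (keys %$num) {
        next unless $den->{$key};
        $ratio{$key} = $num->{$key} / $den->{$key};
    }
    return \%ratio;
}

# Build the five 0301/2601 rows of the sample data with a given value.
sub demo_text {
    my ($value) = @_;
    my @pairs = (['0301', '2601'], ['2601', '0301'], ['0301', '2601'],
                 ['0301', '2601'], ['0301', '2601']);
    return join '', map { "$_->[0]\t$_->[1]\t$value\n" } @pairs;
}

my ($text1, $text2) = (demo_text('1.00'), demo_text('2.00'));
open my $fh1, '<', \$text1 or die "Cannot open in-memory file 1: $!";
open my $fh2, '<', \$text2 or die "Cannot open in-memory file 2: $!";

my $ratio = divide_sums(sum_pairs($fh1), sum_pairs($fh2));
printf "%s = %.2f\n", $_, $ratio->{$_} for sort keys %$ratio;
```

With real files, the in-memory handles would be replaced by three-argument open calls on $ARGV[0] and $ARGV[1]. Skipping zero denominators is one alternative to adding 0.01; which is preferable depends on what an empty cell should show in the matrix.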
Re^2: How to divide the value of a hash key by the value of another hash key (when the keys are equivalent)? by aquinom (Sexton) on Jun 06, 2011 at 18:18 UTC
I seem to be getting the expected output now, after fixing that silly typo in the $infile2 declaration.
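One way to make that kind of index typo harder to repeat is to take both file names from the argument list in a single list assignment and check them up front. A small sketch, assuming a hypothetical check_args helper (not part of the posted script) and made-up file names:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: insist on exactly two distinct file arguments.
sub check_args {
    my @args = @_;
    die "Usage: $0 FILE1 FILE2\n" unless @args == 2;
    die "Both arguments name the same file: $args[0]\n"
        if $args[0] eq $args[1];
    return @args;
}

# In the real script this would be check_args(@ARGV); the names below are
# placeholders so the sketch runs on its own.
my ($infile, $infile2) = check_args('first.txt', 'second.txt');
print "ok: $infile, $infile2\n";

# Passing the same name twice is rejected.
my $same = eval { check_args('first.txt', 'first.txt'); 1 };
print "rejected: $@" unless $same;
```

The list assignment means there is no second $ARGV[...] index to mistype, and the check catches the case where the same file is passed twice on the command line.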
Re^2: How to divide the value of a hash key by the value of another hash key (when the keys are equivalent)? by aquinom (Sexton) on Jun 06, 2011 at 18:20 UTC
The initHash subroutine initializes all possible keys and sets the values to 0 to begin with, though. I understand your change, but I don't see how it's necessary?

sub initHash {  #init the all to all hash
    ##first argument is the hash of data, and the second is a reference to all the columns
    my ($refHash, $refArr) = @_;
    foreach my $ele1(@$refArr){
        foreach my $ele2(@$refArr){
            my $key = $ele1 . "." . $ele2;

            if (exists $$refHash{$key}){
                print STDERR "This key existed in your array of names, skipping\n";
                next;
            }
            else{
                $$refHash{$key} = 0;
            }
        }
    }
}
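The difference between the two approaches can be seen on a shortened allele list. In the sketch below, init_pairs and add_if_known are hypothetical stand-ins for initHash and the exists check in the post, and A9999 is a made-up code that is deliberately absent from the list. Pre-initialising the keys lets exists act as a whitelist, whereas a bare += accepts any key, because Perl treats the undefined value as 0 for that operator and does not emit an "uninitialized" warning.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, mirroring initHash from the post on a shorter list.
sub init_pairs {
    my @names = @_;
    my %table;
    for my $first (@names) {
        $table{"$first.$_"} = 0 for @names;
    }
    return \%table;
}

# Hypothetical helper: add only when the key was pre-initialised.
sub add_if_known {
    my ($table, $key, $value) = @_;
    return 0 unless exists $table->{$key};
    $table->{$key} += $value;
    return 1;
}

my $table = init_pairs(qw(A0301 A2601));
for my $key ('A0301.A2601', 'A9999.A2601') {
    add_if_known($table, $key, 1.00)
        or print STDERR "No key for $key\n";
}
printf "%s = %.2f\n", $_, $table->{$_} for sort keys %$table;

# Without pre-initialisation, += still works on a missing key, so an
# unexpected allele code would be summed silently instead of reported.
my %loose;
$loose{'A9999.A2601'} += 1.00;
printf "loose: %.2f\n", $loose{'A9999.A2601'};
```

So the explicit else branches in the suggested loop are not needed for the arithmetic; what they change is that unknown allele codes are no longer reported.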
Re: How to divide the value of a hash key by the value of another hash key (when the keys are equivalent)? by toolic (Bishop) on Jun 06, 2011 at 18:06 UTC
The code you posted does not compile for me. I get several of these errors:

Global symbol "%diplotypes2" requires explicit package name

Download your own code to make sure what you posted is what you are running.
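That message is what use strict reports when a hash is used without a my declaration in scope. A minimal sketch that reproduces it in isolation, assuming a hypothetical strict_error helper that compiles a snippet with a string eval and returns the error text (empty when the snippet compiles):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: compile and run a snippet under strict, returning
# the error text, or an empty string when it compiles and runs cleanly.
sub strict_error {
    my ($code) = @_;
    my $ok = eval "use strict; use warnings; $code; 1";
    return $ok ? '' : $@;
}

my $undeclared = strict_error('%diplotypes2 = ()');
my $declared   = strict_error('my %diplotypes2 = ()');

print "undeclared: $undeclared";
print "declared:   ", ($declared eq '' ? "compiles\n" : $declared);
```

The exact wording after "requires explicit package name" varies between Perl versions, so the sketch prints whatever the running interpreter reports.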
Re^2: How to divide the value of a hash key by the value of another hash key (when the keys are equivalent)? by aquinom (Sexton) on Jun 06, 2011 at 18:13 UTC
Hey, I think the previous poster saw what I failed to see, but I'll repost the code anyway; it should run.