I seem to be struggling with something basic here... I have 2 input files:
CTACTCTCTCGTTTCCTAGGCTC -1.06
CTACTCTCTCGTTTCCTAGGCTC -1.06
CTACTCTCTCGTTTCCTAGGCTC -1.06
CATGGTCTCATCTTCCTAGGGAG -2.32
CATGGTCTCATCTTCCTAGGGAG -2.32
TAAGGCAGCCCACCCGCAGGCTG -15.60
AATAAGAGTAAGGACTTACTCTT -30.64
TTTCCCTTTCCCCTGCCAGATCT -11.24
TCTATCCTTTGTTTTACAGGAAC -3.05
ACTGTGTATAAATACTTACATCC -16.93
CGGTCCAGGCGTCGGCTACCTGG -22.77
CGGTCCAGGCGTCGGCTACCTGG -22.77
CAGGTACGTATTTTTCCAGGAAG -7.75
CCTGGGAAGAATGTCCTACCTGA -22.07
TTTCTCTTTCTTCAAACAGATGA -13.04
CCCCTTTCAAGTGACTCACAAGA -22.38
AGTGTCCTAGACGAAACACGTGA -17.22
CCACAATCTGATCACATACCTGA -19.09
GTGAGTGTCAGAGCCCTGTGGGC -31.44
GGTGACCTTTAAGGGCAAAATGT -17.26
GGTGACCTTTAAGGGCAAAATGT -17.26
GGTGACCTTTAAGGGCAAAATGT -17.26
TCGCCAAGGTCAGTGGCACAACT -31.06
And this is the second one:
CTACTCTCTCATTTCCTAGGCTC -1.82
CTACTCTCTCATTTCCTAGGCTC -1.82
CTACTCTCTCATTTCCTAGGCTC -1.82
CATGGTCTCGTCTTCCTAGGGAG -1.06
CATGGTCTCGTCTTCCTAGGGAG -1.06
TAAGGCAGCTCACCCGCAGGCTG -13.51
AATAAGAGTAAGGTCTTACTCTT -28.26
TTTCCCTTTCCCTTGCCAGATCT -11.36
TCTATCCTTTGCTTTACAGGAAC -3.27
ACTGTGTATAAATGCTTACATCC -17.69
CGGTCCAGGCGGCGGCTACCTGG -25.61
CGGTCCAGGCGGCGGCTACCTGG -25.61
CAGGTACGTGTTTTTCCAGGAAG -6.62
CCTGGGAAGAATGTTCTACCTGA -22.07
TTTCTCTTTCTCCAAACAGATGA -13.05
CCCCTTTCATGTGACTCACAAGA -16.03
AGTGTCCTAGACAAAACACGTGA -16.88
CCACAATCTGAGCACATACCTGA -26.65
GTGAGTGTCGGAGCCCTGTGGGC -26.06
GGTGACCTTTAAAGGCAAAATGT -17.87
GGTGACCTTTAAAGGCAAAATGT -17.87
GGTGACCTTTAAAGGCAAAATGT -17.87
TCGCCAAGGTCAGTAGCACAACT -35.97
The sequences are the same in both files, just the number varies. I want to be able to output:
[whatever sequence is] [difference between 2 values].
It's obvious to me to use a hash, but when I try and do the calculation in my script it falls down. I tried making some simpler dummy data and the @calc array worked, so I'm not sure where I'm going wrong. My code is here:
#!/usr/bin/perl -w
use strict;

my $file1 = $ARGV[0];
my $file2 = $ARGV[1];

open (FILE1, $file1) or die "Uh oh.. unable to find file $file1"; ## Opens input file
open (FILE2, $file2) or die "Unable to find $file2";

my @maxent_unchanged = <FILE1>; # loads inputfile1 data into array
close FILE1;
my @maxent_with_variant = <FILE2>; ## loads ref genome
close FILE2;

my @NM;
my @max_score_unchanged;
my %max_unchanged;

foreach my $line(@maxent_unchanged) {
    if ($line =~ m/[a-z]/i) {
        push (@NM, $line);
    }
    else {
        push (@max_score_unchanged, $line);
    }
}

my $i = 0;
foreach my $lines(@maxent_unchanged) {
    $max_unchanged{$NM[$i]} = $max_score_unchanged[$i];
    $i++;
}

my @NM_ID;
my @max_score_changed;
my %max_changed;

foreach my $line(@maxent_with_variant) {
    if ($line =~ m/[a-z]/i) {
        push (@NM_ID, $line);
    }
    else {
        push (@max_score_changed, $line);
    }
}

my $i = 0;
foreach my $lines(@maxent_with_variant) {
    $max_changed{$NM_ID[$i]} = $max_score_changed[$i];
    $i++;
}

print %max_unchanged;
print "\n";
print "\n";
print "\n";
print %max_changed;

my @calc;
foreach my $key (keys(%max_changed)) {
    my $value1 = $max_unchanged{$key};
    my $value2 = $max_changed{$key};
    my $calc = $value1 - $value2;
    push (@calc, $calc);
}

use Data::Dumper;
print Dumper @calc;
My other script, which works, is here:
#!/usr/bin/perl -w
use strict;

my %hash;
my %hash2;

%hash = ('John', '-455.45', 'Jack', '-300.00', 'Tom', '-766.75');
%hash2 = ('Jack', '-200.00', 'John', '-44.25', 'Tom', '-999.23');

use Data::Dumper;
print Dumper %hash2;
print "\n";
use Data::Dumper;
print Dumper %hash;

my @calc;
foreach my $key (keys(%hash2)) {
    my $value1 = $hash{$key};
    my $value2 = $hash2{$key};
    my $calc = $value1 - $value2;
    push (@calc, $calc);
}

print "The difference is:", "\n";
use Data::Dumper;
print Dumper @calc;
When I dump the arrays with Data::Dumper, I can see that they hold the correct information, so I don't understand why I can't get what I want... Go easy on me, I'm still a beginner...
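A few things in the sample data may explain the failure. If each line holds a sequence followed by its score, as the sample suggests, then m/[a-z]/i matches every line, so the score arrays stay empty and the hash values are undefined. The sequences in corresponding rows also appear to differ by one base between the two files (for example CTACTCTCTCGTTTCCTAGGCTC versus CTACTCTCTCATTTCCTAGGCTC), so a key taken from one hash would not be found in the other, and repeated sequences would collapse into a single hash key. The following is only a sketch under those assumptions: it pairs records by line position instead of by hash key, and assumes both files list their records in the same order. The names score_differences and read_lines are hypothetical helpers, not part of the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: takes two array refs of "SEQUENCE SCORE" lines
# (assumed to be in the same order) and returns one
# [sequence_from_first_file, score1 - score2] pair per row.
sub score_differences {
    my ($unchanged, $variant) = @_;
    die "Inputs have different numbers of lines\n"
        unless @$unchanged == @$variant;

    my @results;
    for my $i (0 .. $#$unchanged) {
        # split ' ' splits on whitespace and drops the trailing newline
        my ($seq1, $score1) = split ' ', $unchanged->[$i];
        my ($seq2, $score2) = split ' ', $variant->[$i];
        next unless defined $score1 && defined $score2;   # skip blank lines
        push @results, [ $seq1, $score1 - $score2 ];
    }
    return @results;
}

# Hypothetical helper: read a whole file into a list of lines.
sub read_lines {
    my ($path) = @_;
    open my $fh, '<', $path or die "Unable to open $path: $!";
    my @lines = <$fh>;
    close $fh;
    return @lines;
}

# Only runs when two file names are given on the command line.
if (@ARGV == 2) {
    my @unchanged = read_lines($ARGV[0]);
    my @variant   = read_lines($ARGV[1]);
    printf "%s %.2f\n", @$_ for score_differences(\@unchanged, \@variant);
}
```

The subtraction keeps the same direction as the original script (first file minus second file), and the sequence printed is the one from the first file. If the two files are not guaranteed to be in the same order, pairing by position would not be safe and some shared identifier per record would be needed instead.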
In reply to subtracting values from 2 hashes by lecb