Dear Monks,
I have a file in the following format:
ENSG00000088992 TESC 1105894 Prot_Ente 0.31 0.038
ENSG00000105374 NKG7 1105894 Prot_Ente 0.37 0.01
ENSG00000005810 MYCBP2 4322986 Bact_Bact 0.29 0.044
ENSG00000088992 TESC 4322986 Bact_Bact 0.27 0.044
ENSG00000109016 DHRS7B 4322986 Bact_Bact -0.37 0.008
ENSG00000069248 NUP133 364926 Bact_Bact 0.32 0.024
ENSG00000005810 MYCBP2 363400 Firm_Lach -0.29 0.036
ENSG00000105374 NKG7 363400 Firm_Lach -0.27 0.047
ENSG00000105374 NKG7 364736 Firm_Lach -0.27 0.039
ENSG00000105374 NKG7 186735 Firm_Lach -0.30 0.037
ENSG00000133107 TRPC4 4322986 Bact_Bact 0.35 0.01
From this table I want to create a matrix in which the 1st column supplies the row names and the 4th column supplies the column names. The values in the 5th column fill in the matrix, each under the column named in its own line. Something like this:
Gene             Prot_Ente  Bact_Bact  Firm_Lach
ENSG00000088992  0.31
ENSG00000105374  0.37
ENSG00000005810             0.29
ENSG00000088992             0.27
ENSG00000109016             -0.37
ENSG00000069248             0.32
ENSG00000005810                        -0.29
ENSG00000105374                        -0.27
ENSG00000105374                        -0.27
ENSG00000105374                        -0.30
ENSG00000133107             0.35
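The following is the code I am using, and it is not working correctly. Not all the values from the 5th column are printed, because some keys are repeated in the nested hash I am using (the same gene can occur more than once with the same 4th-column name), so later values overwrite earlier ones.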
$file=$ARGV[0];
open(FH,$file);
open OUT1,">./$file\_rho_temp";
while(<FH>){
    chomp;
    next if($_=~/^Gene/);
    @arr=split(/\s+/,$_);
    $rhash{$arr[0]}{$arr[3]}=$arr[4];
}
@keys = keys %{$rhash{(keys %rhash)[0]}};
$format = "%1s " . ("%2s " x @keys) . "\n";
printf OUT1 $format, "Genes", @keys;
foreach $key (keys %rhash) {
    printf OUT1 $format, $key, @{$rhash{$key}}{@keys};
}
Can someone suggest a modification to this code, or another method, to make this work? Thanks a lot in advance.
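For reference, here is a minimal sketch of one possible approach, not a definitive answer. It rests on a few assumptions: the desired output has one row per input line (as in the example above, where the same gene appears on several rows), the output is tab-separated, and cells with no value are left empty. The sub name build_matrix is hypothetical. Two things in the code above seem to cause the missing values: $rhash{gene}{column} holds only one value per gene/column pair, and @keys is taken from the inner hash of a single, arbitrary gene, so columns that gene lacks never reach the header. The sketch avoids both by keeping every record in an array and collecting the column names from all lines.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build one output row per input line, so repeated gene/column pairs
# are kept instead of being overwritten in a hash.
sub build_matrix {
    my @lines = @_;
    my (@cols, %seen, @records);

    for my $line (@lines) {
        chomp(my $copy = $line);
        next if $copy =~ /^Gene/ or $copy !~ /\S/;   # skip header and blank lines
        my @f = split ' ', $copy;
        next if @f < 5;                              # skip lines with too few fields
        my ($gene, $col, $val) = @f[0, 3, 4];
        push @cols, $col unless $seen{$col}++;       # column names in first-seen order
        push @records, [$gene, $col, $val];
    }

    # Header row, then one row per record with the value under its own column.
    my @out = (join "\t", 'Gene', @cols);
    for my $rec (@records) {
        my ($gene, $col, $val) = @$rec;
        push @out, join "\t", $gene, map { $_ eq $col ? $val : '' } @cols;
    }
    return @out;
}

# Assumed usage: the input file is the first command-line argument.
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    print "$_\n" for build_matrix(<$fh>);
    close $fh;
}
```

It could be run as, for example, perl matrix.pl input.txt > input.txt_rho_temp (both file names are placeholders). If one row per gene is wanted instead, the records would have to be merged, and it would then need deciding what to do when a gene has several values for the same column.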
In reply to creating a matrix like format by perl_user123