in reply to transforming a table

Given that you have two sets of non-numeric keys, I would think a hash of hashes is a more logical structure than a hash of arrays. I would also, as Athanasius suggests, separate your data structure construction from your output process. Essentially, build a hash of hashes, create sorted lists of your genes and transcription factors, and then print the table:
#!/usr/bin/perl
use strict;
use warnings;
use List::MoreUtils qw(uniq);
use 5.10.0;

my %gene2TF2val;
while (<DATA>) {
    my ($gene, $tf) = split;
    $gene2TF2val{$gene}{$tf}++;
}

# ID distinct transcription factors, sorted by value
my @tf = sort { ($a =~ /(\d+)/)[0] <=> ($b =~ /(\d+)/)[0]}
         uniq map keys %$_, values %gene2TF2val;

# Print table header
say join "\t", "NAMES", @tf;

# Print table contents
for my $gene (sort keys %gene2TF2val) {
    say join "\t", $gene, map $_ ? '+' : '-', @{$gene2TF2val{$gene}}{@tf};
}

__DATA__
geneA T1
geneA T1
geneA T2
geneB T8
geneC T10
geneC T1
Note that the command line switch -w is (mostly) synonymous with use warnings; the difference is that -w turns warnings on globally, while use warnings is lexically scoped. If you are not yet comfortable with map and slices, you can store intermediate results in arrays:
#!/usr/bin/perl
use strict;
use warnings;
use List::MoreUtils qw(uniq);
use 5.10.0;

my %gene2TF2val;
while (<DATA>) {
    my ($gene, $tf) = split;
    $gene2TF2val{$gene}{$tf}++;
}

# ID distinct transcription factors, sorted by value
my @tf;
for my $tf (values %gene2TF2val) {
    push @tf, keys %$tf;
}
@tf = sort { ($a =~ /(\d+)/)[0] <=> ($b =~ /(\d+)/)[0]} uniq @tf;

# Print table header
say join "\t", "NAMES", @tf;

# Print table contents
for my $gene (sort keys %gene2TF2val) {
    print $gene;
    for my $tf (@tf) {
        my $has = $gene2TF2val{$gene}{$tf} ? '+' : '-';
        print "\t$has";
    }
    print "\n";
}

__DATA__
geneA T1
geneA T1
geneA T2
geneB T8
geneC T10
geneC T1
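
If the slice in the first script is the confusing part, it may help to see it in isolation. The following is only a minimal sketch: %row, $row_ref, @counts, @same and @marks are names invented for this illustration, and the sample values are made up to mirror what geneA's inner hash would hold.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data: one gene's inner hash and the full column list
my %row = (T1 => 2, T2 => 1);
my @tf  = qw(T1 T2 T8 T10);

# A hash slice looks up several keys at once; keys that are absent yield undef
my @counts = @row{@tf};

# The same slice taken through a hash reference,
# which is the form @{$gene2TF2val{$gene}}{@tf} uses
my $row_ref = \%row;
my @same    = @{$row_ref}{@tf};

# map converts each count to a marker; undef is false, so it becomes '-'
my @marks = map { $_ ? '+' : '-' } @same;

print join("\t", @marks), "\n";
```

For this sample data the four markers come out as +, +, -, -, one per entry in @tf. Testing undef for truth does not trigger a warning, which is why the one-line version runs cleanly under use warnings.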

#11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.