in reply to Matching hash keys from different hashes and utilizing in new hash
G'day FIJI42,
Welcome to the Monastery.
Unfortunately, ">50 columns and rows" is somewhat vague. Does it mean more than 50 columns and rows in total, more than 50 columns and more than 50 rows, or something else? Note also that both 51 and 51,000,000 satisfy ">50". In addition, posting an actual Perl data structure showing the wanted or expected results gives a much clearer picture of what you are trying to achieve.
That said, here's the technique I might have used:
    #!/usr/bin/env perl

    use strict;
    use warnings;
    use autodie;

    use Text::CSV;
    use Data::Dump;

    die "Usage: $0 file1 file2" unless @ARGV == 2;

    my ($file1, $file2) = @ARGV;

    my $csv = Text::CSV::->new({sep_char => "\t"});

    my $gene_data_1 = get_gene_data($file1, $csv);
    my $gene_data_2 = get_gene_data($file2, $csv);

    my %gene_common;

    for (keys %$gene_data_1) {
        next unless exists $gene_data_2->{$_};
        push @{$gene_common{$_}}, $gene_data_1->{$_}, $gene_data_2->{$_};
    }

    dd $gene_data_1;
    dd $gene_data_2;
    dd \%gene_common;

    sub get_gene_data {
        my ($file, $csv) = @_;

        my %data;

        open my $fh, '<', $file;

        my $header = $csv->getline($fh);
        my @cols = @$header[1 .. $#$header];

        while (my $row = $csv->getline($fh)) {
            @{$data{$row->[0]}}{@cols} = @$row[1 .. $#$row];
        }

        return \%data;
    }
Which outputs:
    {
      Gene01 => { ColA => 5, ColB => 15 },
      Gene02 => { ColA => 4, ColB => 8 },
      Gene03 => { ColA => 25, ColB => 5 },
    }
    {
      Gene01 => { ColA => 12, ColC => 3 },
      Gene03 => { ColA => 5, ColC => 20 },
      Gene05 => { ColA => 22, ColC => 40 },
      Gene06 => { ColA => 88, ColC => 2 },
    }
    {
      Gene01 => [{ ColA => 5, ColB => 15 }, { ColA => 12, ColC => 3 }],
      Gene03 => [{ ColA => 25, ColB => 5 }, { ColA => 5, ColC => 20 }],
    }
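If the hash slice in get_gene_data() or the shape of %gene_common looks unfamiliar, here is a minimal, self-contained sketch of both ideas in isolation. The helpers row_to_record() and report_lines() are hypothetical names used only for illustration, and the first header field ("Gene") is an assumption, since the input files are not shown in this thread. The sample values are taken from the output above.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not part of the script above): build one gene's
# record from a header row and a data row using a hash slice.
sub row_to_record {
    my ($header, $row) = @_;
    my @cols = @$header[1 .. $#$header];    # column names, skipping the gene-ID column
    my %record;
    @record{@cols} = @$row[1 .. $#$row];    # hash slice: assign every column in one step
    return \%record;
}

# Hypothetical helper: flatten a "common genes" structure (gene => [rec1, rec2])
# into tab-separated lines of gene, source, column, value.
sub report_lines {
    my ($common) = @_;
    my @lines;
    for my $gene (sort keys %$common) {
        my $file_no = 0;
        for my $record (@{ $common->{$gene} }) {
            ++$file_no;
            for my $col (sort keys %$record) {
                push @lines, join "\t", $gene, "file$file_no", $col, $record->{$col};
            }
        }
    }
    return @lines;
}

# Sample rows; the "Gene" header name is assumed, the values come from the output above.
my $rec1 = row_to_record([qw{Gene ColA ColB}], [qw{Gene01 5 15}]);
my $rec2 = row_to_record([qw{Gene ColA ColC}], [qw{Gene01 12 3}]);

my %gene_common = (Gene01 => [$rec1, $rec2]);

print "$_\n" for report_lines(\%gene_common);
```

Keeping the two records side by side in an array, rather than merging them into one hash, means a column name that appears in both files (ColA here) does not overwrite itself.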
Notes (bearing in mind your "New to Perl" comment):
— Ken