I took an approach somewhat similar to the other responders' and used a Hash of Hashes (HoH), combining the two data values for each entry into a single string that can later be split() apart, which avoids a third level of nested data structure.
My approach (just the principal code snippet) is shown below:
my %myHash = ();
foreach (@lines) {
    my ($key1, $key2, $val1, $val2, $rest) = split(/\s+/, $_, 5);
    # Pack both values into one string so a third hash level is not needed.
    my $combinedValue = sprintf("%2s,%2s", $val1, $val2);
    # Skip any line whose first field is not of the form SNP<number>.
    next unless $key1 =~ /SNP(\d+)/;
    my $indx = $1;
    # Start from a fresh copy on every pass so that entries belonging to
    # a different $key2 cannot leak into this one.
    my %tempHash = exists $myHash{$key2} ? %{ $myHash{$key2} } : ();
    $tempHash{$indx} = $combinedValue;
    $myHash{$key2} = {%tempHash};
}
foreach my $key (sort keys %myHash) {
    my %tempHash2 = %{ $myHash{$key} };
    my $line2output = "$key ";
    # Sort the SNP indices numerically so that 10 comes after 9, not after 1.
    foreach my $sortedKey (sort { $a <=> $b } keys %tempHash2) {
        $line2output .= sprintf(" %2s %2s", split(',', $tempHash2{$sortedKey}));
    }
    print "$line2output\n";
}
I have also put the OP's example input data into an array, @lines, to simplify my testing. If the lines are instead read from a file, one would use foreach (<INPUT>){} in place of my foreach (@lines){} loop.
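For the file-reading case, here is a minimal sketch of how it might look. Note the assumptions: the sample rows and the in-memory filehandle are made up for illustration (they only mimic the column layout that the split() above expects), and I have swapped in while (my $line = <$in>) and direct nested assignment (autovivification) in place of the foreach and temporary-hash copying. A while loop reads one line at a time, whereas foreach (<INPUT>) pulls the whole file into a list first.

```perl
use strict;
use warnings;

# Hypothetical sample rows; the real data would come from the OP's file.
my $data = "SNP1 ind1 A G extra\nSNP2 ind1 C T extra\nSNP1 ind2 G G extra\n";

# An in-memory filehandle stands in for open(my $in, '<', $filename).
open(my $in, '<', \$data) or die "Cannot open input: $!";

my %myHash;
while (my $line = <$in>) {
    chomp $line;
    my ($key1, $key2, $val1, $val2) = split(/\s+/, $line, 5);
    # Skip short lines and lines whose first field is not SNP<number>.
    next unless defined $val2 && $key1 =~ /SNP(\d+)/;
    # Autovivification creates the inner hash the first time $key2 is seen.
    $myHash{$key2}{$1} = sprintf("%2s,%2s", $val1, $val2);
}
close($in);

my @output;
foreach my $key (sort keys %myHash) {
    my $line2output = "$key ";
    # Numeric sort of the SNP indices within each row.
    foreach my $indx (sort { $a <=> $b } keys %{ $myHash{$key} }) {
        $line2output .= sprintf(" %2s %2s", split(',', $myHash{$key}{$indx}));
    }
    push @output, $line2output;
}
print "$_\n" for @output;
```

With a real file, only the open() call should need to change; the rest of the loop is the same idea as my snippet, just without the intermediate hash copies.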
I hope this helps and shows yet another approach that works. I didn't spend a lot of time optimizing or simplifying. I figure that is a worthwhile exercise for the reader and the OP.
In reply to Re: Table manipulation, array or hash?
by ack
in thread Table manipulation, array or hash?
by robertkraus