Hi
I was wondering about your answer, so I put together this self-contained snippet, which should show the relevant elements.
#!/usr/bin/env perl
use strict;
use warnings;
use 5.010;

my %info;

while (my $line = <DATA>) {
    chomp $line;
    if ($line =~ /##INFO=<ID=/) {
        # Header line: the ID sits between 'ID=' and the first comma.
        my ($first) = split /,/, $line;
        my (undef, $id) = split /ID=/, $first;
        $info{$id} = [];
    }
    elsif ($line !~ /#/) {
        # Data line: a row number, whitespace, then key=value pairs joined by ';'.
        my ($numbers, $data) = split /\s+/, $line;
        my %rowvalues;
        foreach my $element (split /;/, $data) {
            my ($key, $value) = split /=/, $element;
            $rowvalues{$key} = $value;
        }
        # Every known ID gets exactly one entry per row; 'NA' marks a missing key.
        foreach my $key (keys %info) {
            push @{ $info{$key} },
                exists $rowvalues{$key} ? $rowvalues{$key} : 'NA';
        }
    }
    # Any other line (a plain comment) is skipped.
}

foreach my $header (sort keys %info) {
    say $header, ' => ', join(',', @{ $info{$header} });
}
__DATA__
# First the headers
##INFO=<ID=AA,
##INFO=<ID=AB,
##INFO=<ID=AC,
# then the data
1 AA=1;AB=2;AC=3
2 AA=2;AB=2
3 AA=5;AB=1;AC=1
I hope this clarifies what was said before. I changed one split from '\t' to '\s+' because pasting the code here would probably destroy the tab character.
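For real tab-separated input, the tab-based split might look roughly like the sketch below. The sample row in $line is made up for illustration, and the limit of 2 is an assumption added here (not part of the snippet above) so that any further tabs would stay inside $data.

```perl
use strict;
use warnings;

# Hypothetical sample row: row number and key=value list separated by a real tab.
my $line = "1\tAA=1;AB=2;AC=3";

# Split on a literal tab into at most two fields.
my ($numbers, $data) = split /\t/, $line, 2;

print "$numbers\n";    # prints 1
print "$data\n";       # prints AA=1;AB=2;AC=3
```

Unlike '\s+', splitting on '\t' leaves any spaces inside a field untouched.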
Regards
McA