in reply to Adding missing values into a hash

Try
#!perl
use strict;
use Text::CSV;

my %info = ();
my $line_count = 0;
while (my $line = <DATA>){
  chomp($line);
  if ($line =~ /##INFO=<ID=([^,]+)/){
    $info{$1} = [];
  } else {
    # data line: record number, a tab, then key=value pairs separated by ';'
    my (undef,%hash) = split /[\t;=]/,$line;
    for (keys %info){
      push @{$info{$_}},$hash{$_} || 'NA';
    }
    ++$line_count;
  }
}

my $csv = Text::CSV->new ( {binary=>1, eol=>"\012"} )
  or die "Cannot use CSV: ".Text::CSV->error_diag();
open my $fh,'>','output.csv'
  or die "Could not open output.csv $!";

my @col_head = sort keys %info;
$csv->print($fh, \@col_head);
for my $i (1..$line_count){
  my @row = map { $info{$_}[$i-1] } @col_head;
  $csv->print($fh, \@row);
}

__DATA__
##INFO=<ID=AA,
##INFO=<ID=AB,
##INFO=<ID=AC,
1 AA=1;AB=2;AC=3
2 AA=2;AB=2
3 AA=5;AB=1;AC=1
poj

Re^2: Adding missing values into a hash
by Biopolete (Initiate) on Jun 19, 2014 at 14:32 UTC

    Thank you very much for your answer :)

    Your answer looks promising, but for some reason I get too many "NA" values.

    With the first part of the script I get, for example,

    AA=> NA, NA,1,2,5,NA,NA,NA,NA,NA

    instead of

    AA=> 1,2,5

    Perhaps it is related to

    my (undef,%hash) = split /[\t;=]/,$line;

    because you are splitting 3 times; I don't know.

    The final csv is

    AA => NA,NA,NA

    AB => NA,NA,NA

    AC => NA,NA,NA

    Perhaps this is caused by the same problem with the "NA" values.

      Do you have other lines in the file apart from those like
      ##INFO=<ID=AA, and 1 AA=1;AB=2;AC=3 ? Blank lines for example.

      poj

        No, I don't have any other lines in the file :(

        I have the same problem if I use your complete script with:

        __DATA__

        ##INFO=<ID=AA,

        ##INFO=<ID=AB,

        ##INFO=<ID=AC,

        1 AA=1;AB=2;AC=3

        2 AA=2;AB=2

        3 AA=5;AB=1;AC=1

        I only get "NA" values in the CSV file.
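
One possible cause, offered as a guess rather than a diagnosis: the pattern /[\t;=]/ expects a tab between the record number and the first key. If that tab became a space when the data was copied from the page, the first field is "1 AA", every key shifts by one, and no lookup succeeds. The sketch below compares the original pattern with a more lenient one on such a line. The helper parse_line and the pattern /[\s;=]+/ are illustrative assumptions, not part of the posted script.

```perl
use strict;
use warnings;

# Hypothetical helper: split one data line with the given pattern and
# return the key => value pairs, discarding the leading record number.
sub parse_line {
    my ($line, $pattern) = @_;
    my (undef, @pairs) = split $pattern, $line;
    push @pairs, undef if @pairs % 2;   # pad so the hash gets an even list
    my %hash = @pairs;
    return \%hash;
}

# A data line where the tab has been replaced by a single space.
my $spaced = "1 AA=1;AB=2;AC=3";

my $strict  = parse_line($spaced, qr/[\t;=]/);    # pattern from the script
my $lenient = parse_line($spaced, qr/[\s;=]+/);   # accepts any whitespace

printf "original pattern: AA=%s\n", $strict->{AA}  // 'NA';
printf "lenient pattern:  AA=%s\n", $lenient->{AA} // 'NA';
```

With the space-separated line, the original pattern finds no AA key and falls back to NA, while the lenient pattern recovers AA=1. If this is what is happening, either restore the tabs in the data or change the character class in the split from \t to \s.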