Without proper post formatting it is hard to see what you want, but at a guess your input looks like this? :
ARL6IP2,298757,Hyperalgesia,MESH:D006930 ARL6IP2,298757,Liver Diseases,MESH:D008107 ARL6IP2,298757,"Liver Failure, Acute",MESH:D017114 ARL6IP2,298757,Liver Neoplasms,MESH:D008113 CCL22,6367,Esophageal Neoplasms,MESH:D004938 CCL22,6367,Fatty Liver,MESH:D005234 CCL22,6367,Fetal Growth Retardation,MESH:D005317 CCL22,6367,Fever,MESH:D005334
In which case you need to build a hash based on your unique keys (gene names by the look of it). Hashes are perfect for dealing with data that is organised by unique identifiers. I would probably go about it like this :
use warnings; use strict; ## for parsing the CSV use Text::CSV::Simple; ## for writing to excel use Spreadsheet::WriteExcel; my $datafile = ... the one you gave earlier ... # Only capture feilds of interest my $parser = Text::CSV::Simple->new; $parser->want_fields(1, 3, 4, ); my @data = $parser->read_file($datafile); ## data is now read in (if you file is really big, you can do this on- +the-fly too) my %hash = (); ## for storing collated data foreach @data{ ## store terms acording to unique id push @{ $hash{$_->[0]}->[0] }, $_->[1]; ## disease term stored in an + array push @{ $hash{$_->[0]}->[1] }, $_->[2]; ## MeSH term also stored in +an array } ## now print out data # Create a new workbook called simple.xls and add a worksheet my $workbook = Spreadsheet::WriteExcel->new('simple.xls'); my $worksheet = $workbook->add_worksheet(); # The general syntax is write($row, $column, $token). Note that row an +d # column are zero indexed my $row = 0; for my $key ( keys %hash ){ $worksheet->write($row, 0, $key); ## id $worksheet->write($row, 1, ( join ', ', @{ $hash{$key}->[0] } )); ## + diseases $worksheet->write($row, 2, ( join ', ', @{ $hash{$key}->[1] } )); ## + MeSH ++$row; ## move to the next row } print "$0 completed : ".(scalar(localtime))."\n";
I haven't tested this, but hopefully it will give you some ideas.
In reply to Re: csv parsing
by BioLion
in thread csv parsing
by syedarshi16
| For: | Use: | ||
| & | & | ||
| < | < | ||
| > | > | ||
| [ | [ | ||
| ] | ] |