Without proper post formatting it is hard to see what you want, but at a guess, your input looks something like this:

    ARL6IP2,298757,Hyperalgesia,MESH:D006930
    ARL6IP2,298757,Liver Diseases,MESH:D008107
    ARL6IP2,298757,"Liver Failure, Acute",MESH:D017114
    ARL6IP2,298757,Liver Neoplasms,MESH:D008113
    CCL22,6367,Esophageal Neoplasms,MESH:D004938
    CCL22,6367,Fatty Liver,MESH:D005234
    CCL22,6367,Fetal Growth Retardation,MESH:D005317
    CCL22,6367,Fever,MESH:D005334

In which case you need to build a hash based on your unique keys (gene names by the look of it). Hashes are perfect for dealing with data that is organised by unique identifiers.
I would probably go about it something like this:

    use warnings;
    use strict;

    ## for parsing the CSV
    use Text::CSV::Simple;
    ## for writing to excel
    use Spreadsheet::WriteExcel;

    my $datafile = '...';    # the one you gave earlier

    # Only capture fields of interest: gene name, disease term, MeSH id
    # (as far as I can tell, want_fields takes zero-based column indices)
    my $parser = Text::CSV::Simple->new;
    $parser->want_fields(0, 2, 3);
    my @data = $parser->read_file($datafile);
    ## data is now read in (if your file is really big, you can do this on-the-fly too)

    my %hash = ();    ## for storing collated data
    foreach (@data) {
        ## store terms according to unique id
        push @{ $hash{ $_->[0] }->[0] }, $_->[1];    ## disease term stored in an array
        push @{ $hash{ $_->[0] }->[1] }, $_->[2];    ## MeSH term also stored in an array
    }

    ## now print out data
    # Create a new workbook called simple.xls and add a worksheet
    my $workbook  = Spreadsheet::WriteExcel->new('simple.xls');
    my $worksheet = $workbook->add_worksheet();

    # The general syntax is write($row, $column, $token). Note that row and
    # column are zero indexed
    my $row = 0;
    for my $key ( sort keys %hash ) {
        $worksheet->write($row, 0, $key);                                    ## id
        $worksheet->write($row, 1, ( join ', ', @{ $hash{$key}->[0] } ));    ## diseases
        $worksheet->write($row, 2, ( join ', ', @{ $hash{$key}->[1] } ));    ## MeSH
        ++$row;    ## move to the next row
    }

    print "$0 completed : " . ( scalar(localtime) ) . "\n";

I haven't tested this, but hopefully it will give you some ideas.
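
The code comment above mentions that a really big file can be handled on the fly instead of being read in all at once. Here is a rough sketch of that idea, building the same hash of arrays one line at a time. The sub name collate_by_gene and the in-memory sample are made up for illustration. It uses the core module Text::ParseWords only to avoid extra dependencies; that is an assumption on my part, and it will choke on fields containing apostrophes or backslashes, so a real CSV parser such as Text::CSV is probably the better choice for real data.

```perl
use strict;
use warnings;
use Text::ParseWords qw(parse_line);

# Hypothetical helper: read CSV records from a filehandle one line at a
# time and collect disease and MeSH terms under each gene name.
sub collate_by_gene {
    my ($fh) = @_;
    my %hash;
    while ( my $line = <$fh> ) {
        chomp $line;
        next unless $line =~ /\S/;    # skip blank lines

        # parse_line copes with quoted fields such as "Liver Failure, Acute"
        my ( $gene, undef, $disease, $mesh ) = parse_line( ',', 0, $line );
        unless ( defined $gene && defined $mesh ) {
            warn "Skipping unparsable line $.\n";
            next;
        }
        push @{ $hash{$gene}[0] }, $disease;    # disease terms
        push @{ $hash{$gene}[1] }, $mesh;       # MeSH ids
    }
    return \%hash;
}

# Demo on an in-memory filehandle holding a few of the sample records
my $sample = <<'END';
ARL6IP2,298757,Hyperalgesia,MESH:D006930
ARL6IP2,298757,"Liver Failure, Acute",MESH:D017114
CCL22,6367,Fever,MESH:D005334
END

open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my $collated = collate_by_gene($fh);
close $fh;

for my $gene ( sort keys %$collated ) {
    print join( "\t",
        $gene,
        join( '; ', @{ $collated->{$gene}[0] } ),
        join( ', ', @{ $collated->{$gene}[1] } ) ), "\n";
}
```

Swap the in-memory handle for one opened on your real file and feed the resulting hash to the spreadsheet-writing loop. The diseases are joined with '; ' here because some disease names contain commas themselves.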

Just a something something...

In reply to Re: csv parsing by BioLion
in thread csv parsing by syedarshi16
