Like the other responders, I'm not entirely sure what you're looking for. Going by the general direction of the thread, though, here is another strategy for what I think you're after.

#!/usr/bin/perl
use strict;
use warnings;

############# Set up Input File #####################
#
# Output will be to the terminal, in OP's code the
# results would be output to the spreadsheet or to
# a .csv file that a spreadsheet could read
#
my $dir = 'c:/Documents and Settings/xxxxxx/';
my $inFileName = 'MeshInputData.txt';
my $inFile = $dir . $inFileName;
open(IN,"<",$inFile)||die "Can't open input file $inFileName: $!\n";
# open(OUT,">",$outFile)||die "Can't open output file $outFileName: $!\n";

#####################################################
#
# Read the file and process each line from input file
# one at a time per OP's preference.
#
my %codes = ();
my @diseases = ();
my @meshes = ();
my $i = 0;    # line counter; must be declared under "use strict"
foreach my $line (<IN>){
    $i++;
    chomp($line);
    my($code,$codeIndex,$disease,$mesh) = $line =~ /^(.+),([0-9]+),(.+),(MESH:(?:[a-zA-Z0-9]+))$/;
    next unless defined $code;    # skip lines that don't match the expected format
    if(exists $codes{$code}){
        # Known code: copy out the stored lists, extend them, store fresh copies
        @diseases = @{$codes{$code}[0]};
        @meshes = @{$codes{$code}[1]};
        push(@diseases,$disease);
        push(@meshes,$mesh);
        $codes{$code} = [[@diseases],[@meshes]];
    } else {
        # First time this code is seen: start new lists
        @diseases = ($disease);
        @meshes = ($mesh);
        $codes{$code} = [[@diseases],[@meshes]];
    }
}
close(IN);

##############################################
#
# Display the results to the terminal
#
$i = 0;
foreach my $code (keys %codes){
    $i++;
    my $diseases_ref = $codes{$code}[0];
    my $meshes_ref = $codes{$code}[1];
    my @diseases = @{$diseases_ref};
    my @meshes = @{$meshes_ref};
    my $diseases = join(",",@diseases);
    my $meshes = join(",",@meshes);
    print "$i: ($code),($diseases),($meshes)\n";
}
exit(0);

The input file that I used to test the above code looks like this:

ARL6IP2,298757,Hyperalgesia,MESH:D006930
ARL6IP2,298757,Liver Diseases,MESH:D008107
ARL6IP2,298757,"Liver Failure, Acute",MESH:D017114
ARL6IP2,298757,Liver Neoplasms,MESH:D008113
CCL22,6367,Esophageal Neoplasms,MESH:D004938
CCL22,6367,Fatty Liver,MESH:D005234
CCL22,6367,Fetal Growth Retardation,MESH:D005317
CCL22,6367,Fever,MESH:D005334

With that input and the above code, the output to the terminal looks like the following:

1: (CCL22),(Esophageal Neoplasms,Fatty Liver,Fetal Growth Retardation,Fever),(MESH:D004938,MESH:D005234,MESH:D005317,MESH:D005334)
2: (ARL6IP2),(Hyperalgesia,Liver Diseases,"Liver Failure, Acute",Liver Neoplasms),(MESH:D006930,MESH:D008107,MESH:D017114,MESH:D008113)

As the other responders suggested, this code uses a hash keyed on the various codes (e.g., "ARL6IP2" or "CCL22"). Each entry in the hash stores a reference to an array which itself contains references to two other arrays. One of those is an array of what I call the @diseases (e.g., "Hyperalgesia" or "Liver Failure, Acute" associated with each $code), and the other is an array of what I call the @meshes (i.e., the various "MESH:D006930"-style identifiers associated with each $code).
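For what it's worth, the same two-level structure can probably be built without the if/else branch, because Perl autovivifies the nested arrays on the first push. The following is only a sketch under the same assumptions about the four-field input; the sub name group_by_code and the sample lines passed to it are mine, for illustration, and are not part of the script above.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original script). Builds
#   $codes{$code} = [ [diseases...], [meshes...] ]
# from a list of already-chomped input lines.
sub group_by_code {
    my @lines = @_;
    my %codes;
    for my $line (@lines) {
        my ($code, $codeIndex, $disease, $mesh) =
            $line =~ /^(.+),([0-9]+),(.+),(MESH:[a-zA-Z0-9]+)$/;
        next unless defined $code;               # skip lines that don't match
        push @{ $codes{$code}[0] }, $disease;    # slot 0: diseases
        push @{ $codes{$code}[1] }, $mesh;       # slot 1: MESH identifiers
    }
    return \%codes;
}

my $codes = group_by_code(
    'CCL22,6367,Fever,MESH:D005334',
    'CCL22,6367,Fatty Liver,MESH:D005234',
    'ARL6IP2,298757,"Liver Failure, Acute",MESH:D017114',
);

# Walk the structure: each value is [ \@diseases, \@meshes ]
for my $code (sort keys %$codes) {
    my ($diseases, $meshes) = @{ $codes->{$code} };
    print "($code),(", join(",", @$diseases), "),(", join(",", @$meshes), ")\n";
}
```

Since each push goes straight into the array stored under that code, no temporary arrays are shared between entries, so no copying is needed in this variant.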

The one perhaps odd-looking construct in the script is:

$codes{$code} = [[@diseases],[@meshes]];

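The inner square brackets are there to ensure that the entries in the hash don't all point to the exact same array. [@diseases] builds a new anonymous array holding a copy of the current contents of @diseases and returns a reference to it. Since @diseases and @meshes are declared once, outside the loop, storing \@diseases and \@meshes instead would leave every entry referring to the same two arrays. Copying with square brackets like this is the usual recommended way to avoid continually pointing at the same data structure.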
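A small sketch of the difference, in case it helps. The variable names $shared and $copy are mine and purely illustrative; they don't appear in the script above.

```perl
use strict;
use warnings;

my @diseases = ('Fever');

my $shared = \@diseases;     # reference to the SAME array
my $copy   = [@diseases];    # new anonymous array holding a copy

push @diseases, 'Fatty Liver';    # change the original afterwards

print scalar(@$shared), "\n";     # prints 2: the reference sees the change
print scalar(@$copy),   "\n";     # prints 1: the copy is unaffected
```

This is why the script stores [@diseases] rather than \@diseases: each hash entry keeps its own snapshot even though the named array is reused on the next pass through the loop.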

I hope this helps show another way to do it.

ack Albuquerque, NM

In reply to Re: csv parsing by ack
in thread csv parsing by syedarshi16
