Thank you very much for your reply. Here is a sample of my original data, "kegg_pathway_title.txt":

PVX_088085    Protein processing in endoplasmic reticulum
PVX_114095    Protein processing in endoplasmic reticulum
PVX_123055    Ribosome biogenesis in eukaryotes
PYYM_1032000    -
PYYM_1120600    -
PCYB_031930    Purine metabolism; Metabolic pathways; DNA replication; Pyrimidine metabolism

The file orthogroups_3.csv has 13 columns:

Cparvum    Bmicroti    Tparva    Pberghei    Pchabaudi    Pcynomolgi    Pfalciparum    Pknowlesi    Preichenowi    Pvivax    Pyoelii    Pmalariae    Tgondii
OG0000000    PBANKA_0000600, PBANKA_0000701, PBANKA_0000801, PBANKA_0001001, PBANKA_0001101, PBANKA_0001201, PBANKA_0001301, PBANKA_0001401, PBANKA_0001501, PBANKA_0006300, PBANKA_0006401, PBANKA_0006501, PBANKA_0006600, PBANKA_0006701,
OG0000001    PmUG01_00010100.1-p1, PmUG01_00010200.1-p1, PmUG01_00010400.1-p1, PmUG01_00010500.1-p1, PmUG01_00010600.1-p1, PmUG01_00010700.1-p1, PmUG01_00010800.1-p1, PmUG01_00010900.1-p1, PmUG01_00011000.1-p1, PmUG01_00011300.1-p1, PmUG01_00011400.1-p1, PmUG01_00011600.1-p1, PmUG01_00011700.1-p1, PmUG01_00012100.1-p1, PmUG01_00012200.1-p1,

Expected output:

Cparvum Bmicroti Tparva Pberghei Pchabaudi Pcynomol +gi Pfalciparum Pknowlesi Preichenowi Pvivax Pyoelii + Pmalariae Tgondii OG0000000 - , - , - , - , - , - , - + , - , - , - , - , - , - , - , - , - , - + , - , - , - , - , - , - , - , - , - , - + , - , - , - , - , - , - , - , - , - , +- , - , - , - , - , - , - , - , - , - , + - , - , - , - , - , - , - , - , - , - +, - , - , - , - , - , - , - , - , - , - + , - , - , - , - , - , - , - , - , - , - + , - , - , - , - , - , - , - , - , - , - + , - , - , - , - , - , - , - , - , - , - + , - , - , - , - , - , - , - , - , - , +- , - , - , - , - , - , - , - , - , - , + - , - , - , - , - , - , - , - , - , - +, - , - OG0000024 - , - , - , - - , - , - - + , - , - - , - , - , - - , - , - + Protein processing in endoplasmic reticulum , - , - , - + , - - , - , - - , - , - - , - + , - - , - , - , - - , - , - - + , - , - , - - , - , - , - , - , - , +- , - , - , - , - , - , - , - , - OG0000025 - , - , - - , - , - , - - + , - , - , - - , - , - , - - , - , + - , - Protein processing in endoplasmic reticulum , Pro +tein processing in endoplasmic reticulum , - , Ribosome biogene +sis in eukaryotes - , - , - , - - , - , +- , - - , - , - , - - , Protein processi +ng in endoplasmic reticulum , Protein processing in endoplasmic re +ticulum , Ribosome biogenesis in eukaryotes - , - , - + , - - , - , - , - - , - , - , - + , - , - , - OG0000026

I want the parsed results to have the same number of columns (13) as orthogroups_3.csv.

Best regards,
Zillur
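For illustration only, here is a minimal sketch of one way to keep the column count fixed while replacing gene IDs with pathway titles. It is written in Python rather than Perl and rests on several assumptions that are not confirmed by the data above: that orthogroups_3.csv is tab-separated, that its header row starts with a (possibly empty) field for the orthogroup ID, that gene IDs inside a cell are comma-separated, and that an ID missing from kegg_pathway_title.txt should become "-". The function names `load_titles`, `annotate_row`, and `annotate` are hypothetical.

```python
import csv
import io


def load_titles(lines):
    """Map gene ID -> pathway title; the ID is taken as the first whitespace-delimited token."""
    titles = {}
    for line in lines:
        parts = line.split(None, 1)
        if not parts:
            continue  # skip blank lines
        title = parts[1].strip() if len(parts) > 1 else ""
        titles[parts[0]] = title or "-"
    return titles


def annotate_row(row, titles, n_cols):
    """Replace every gene ID with its title and return exactly 1 + n_cols fields."""
    og_id, cells = row[0], list(row[1:])
    cells += [""] * (n_cols - len(cells))  # pad short rows so no column is lost
    out = [og_id]
    for cell in cells[:n_cols]:
        ids = [g.strip() for g in cell.split(",") if g.strip()]
        # An empty cell stays empty; unknown IDs become "-" (assumption).
        out.append(" , ".join(titles.get(g, "-") for g in ids))
    return out


def annotate(orthogroups, titles):
    """Read tab-separated orthogroup rows and return annotated rows of equal width."""
    reader = csv.reader(orthogroups, delimiter="\t")
    header = next(reader)
    n_cols = len(header) - 1  # assumes the first header field belongs to the orthogroup ID
    rows = [header]
    for row in reader:
        if row:
            rows.append(annotate_row(row, titles, n_cols))
    return rows


# Tiny made-up table with three species columns, just to show the shape is preserved.
demo_titles = load_titles(["PVX_088085\tProtein processing in endoplasmic reticulum\n"])
demo_table = io.StringIO("\tA\tB\tC\nOG1\tPVX_088085, PVX_000000\t\tPVX_111111\n")
for demo_row in annotate(demo_table, demo_titles):
    print(len(demo_row), demo_row)
```

The key point in this sketch is that each cell is processed on its own and written back to the same position, so empty cells remain empty fields instead of disappearing. Joining the rows with tabs on output would then give the same 13 species columns as the input.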


In reply to Re^2: Parsing csv without changing dimension of original file by zillur
in thread Parsing csv without changing dimension of original file by zillur
