in thread Parsing csv without changing dimension of original file
Thank you very much for your reply. Here is a sample of my original data, "kegg_pathway_title.txt":
PVX_088085 Protein processing in endoplasmic reticulum
PVX_114095 Protein processing in endoplasmic reticulum
PVX_123055 Ribosome biogenesis in eukaryotes
PYYM_1032000 -
PYYM_1120600 -
PCYB_031930 Purine metabolism; Metabolic pathways; DNA replication; Pyrimidine metabolism
The file orthogroups_3.csv has 13 columns:
Cparvum Bmicroti Tparva Pberghei Pchabaudi Pcynomolgi Pfalciparum Pknowlesi Preichenowi Pvivax Pyoelii Pmalariae Tgondii
OG0000000 PBANKA_0000600, PBANKA_0000701, PBANKA_0000801, PBANKA_0001001, PBANKA_0001101, PBANKA_0001201, PBANKA_0001301, PBANKA_0001401, PBANKA_0001501, PBANKA_0006300, PBANKA_0006401, PBANKA_0006501, PBANKA_0006600, PBANKA_0006701,
OG0000001 PmUG01_00010100.1-p1, PmUG01_00010200.1-p1, PmUG01_00010400.1-p1, PmUG01_00010500.1-p1, PmUG01_00010600.1-p1, PmUG01_00010700.1-p1, PmUG01_00010800.1-p1, PmUG01_00010900.1-p1, PmUG01_00011000.1-p1, PmUG01_00011300.1-p1, PmUG01_00011400.1-p1, PmUG01_00011600.1-p1, PmUG01_00011700.1-p1, PmUG01_00012100.1-p1, PmUG01_00012200.1-p1,
Expected output:
Cparvum Bmicroti Tparva Pberghei Pchabaudi Pcynomol +gi Pfalciparum Pknowlesi Preichenowi Pvivax Pyoelii + Pmalariae Tgondii OG0000000 - , - , - , - , - , - , - + , - , - , - , - , - , - , - , - , - , - + , - , - , - , - , - , - , - , - , - , - + , - , - , - , - , - , - , - , - , - , +- , - , - , - , - , - , - , - , - , - , + - , - , - , - , - , - , - , - , - , - +, - , - , - , - , - , - , - , - , - , - + , - , - , - , - , - , - , - , - , - , - + , - , - , - , - , - , - , - , - , - , - + , - , - , - , - , - , - , - , - , - , - + , - , - , - , - , - , - , - , - , - , +- , - , - , - , - , - , - , - , - , - , + - , - , - , - , - , - , - , - , - , - +, - , - OG0000024 - , - , - , - - , - , - - + , - , - - , - , - , - - , - , - + Protein processing in endoplasmic reticulum , - , - , - + , - - , - , - - , - , - - , - + , - - , - , - , - - , - , - - + , - , - , - - , - , - , - , - , - , +- , - , - , - , - , - , - , - , - OG0000025 - , - , - - , - , - , - - + , - , - , - - , - , - , - - , - , + - , - Protein processing in endoplasmic reticulum , Pro +tein processing in endoplasmic reticulum , - , Ribosome biogene +sis in eukaryotes - , - , - , - - , - , +- , - - , - , - , - - , Protein processi +ng in endoplasmic reticulum , Protein processing in endoplasmic re +ticulum , Ribosome biogenesis in eukaryotes - , - , - + , - - , - , - , - - , - , - , - + , - , - , - OG0000026
I want the parsed result to have the same number of columns (13) as orthogroups_3.csv.

Best regards,
Zillur
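For illustration, the transformation described above could be sketched as follows. This is a minimal sketch in Python rather than Perl, and it rests on several assumptions that the post does not confirm: both files are tab-separated, the header row of orthogroups_3.csv starts with an empty cell above the orthogroup IDs, each species cell holds a comma-separated list of gene IDs, and an ID missing from kegg_pathway_title.txt is written as "-". The helper names (`load_titles`, `annotate_cell`, `annotate_table`) and the demo IDs `OG1`, `OG2`, and `PYYM_9999999` are hypothetical. Every output row is padded to the header's width, so the column count stays the same as in the input.

```python
import csv
import io


def load_titles(lines):
    """Build a gene ID -> pathway title map from 'ID<TAB>title' lines."""
    titles = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        gene_id, _, title = line.partition("\t")
        titles[gene_id.strip()] = title.strip() or "-"
    return titles


def annotate_cell(cell, titles):
    """Replace every gene ID in one species cell with its title ('-' if unknown)."""
    ids = [g.strip() for g in cell.split(",") if g.strip()]
    return " , ".join(titles.get(g, "-") for g in ids)


def annotate_table(rows, titles):
    """Yield the header unchanged, then each row with the same number of cells."""
    rows = iter(rows)
    header = next(rows)
    width = len(header)
    yield header
    for row in rows:
        # Pad short rows so the output keeps the header's column count.
        row = row + [""] * (width - len(row))
        yield [row[0]] + [annotate_cell(c, titles) for c in row[1:]]


# Small in-memory demo; OG1, OG2 and PYYM_9999999 are made-up values.
titles_text = (
    "PVX_088085\tProtein processing in endoplasmic reticulum\n"
    "PVX_123055\tRibosome biogenesis in eukaryotes\n"
    "PYYM_1032000\t-\n"
)
table_text = (
    "\tPvivax\tPyoelii\n"
    "OG1\tPVX_088085, PVX_123055\tPYYM_1032000\n"
    "OG2\t\tPYYM_9999999\n"
)

titles = load_titles(titles_text.splitlines())
reader = csv.reader(io.StringIO(table_text), delimiter="\t")
result = list(annotate_table(reader, titles))

for out_row in result:
    print("\t".join(out_row))
```

In this sketch an empty species cell stays empty instead of becoming "-", which keeps the cell positions aligned with the header; if empty cells should also show "-", that is a one-line change in `annotate_cell`.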
Re^3: Parsing csv without changing dimension of original file
by huck (Prior) on Mar 07, 2017 at 01:08 UTC | |
by zillur (Novice) on Mar 07, 2017 at 03:54 UTC |