in reply to Re: Perl script help to convert .txt file to .csv
in thread Perl script help to convert .txt file to .csv
Using your example, I found that the header part is not needed at all. I removed that part of the code and then used a series of s///g substitutions to strip the remaining field names.
Here is the code:

    use strict;
    use warnings;

    print "\n Running script for Jiggs \n";

    my $infile = "foot.txt";
    open my $in, "<", $infile or die $!;
    open my $out, ">", "foot1.txt" or die $!;

    while (<$in>) {
        s/\s+\Z/\n/;
        s/ +/,/g;
        s/,length=/,/g;
        s/,xy=/,/g;
        s/,region=/,/g;
        s/,run=/,/g;
        print $out $_;
    }
    close $in;
    close $out;
    print "\n Done!\n";

Here is the output:

    >G9JVYGV01AJE8V,135,0104_0349,1,R_2011_09_20_15_00_06_
    GGTGGTAGTGAAGAAGAGGAGATGAAAGTGGAAGAGGTTGAGGATGAGAAGGTTGAATTG
    GAAGAAGAAGATGAGAAGGTTGAAGTGGAAGATGAGAAGGTTGAAGTGGAAGAAGATGAA
    GTGGAAGAGAGGAGC
    >G9JVYGV01A4910,90,0353_0150,1,R_2011_09_20_15_00_06_
    GGTGCATGGCATTGTAGATGGTTGCTTGATAGTTGCCCATACGTGTACTACACTTGCAGA
    GTGAAGCAACCATCTACAATGCCATGCACC
    >G9JVYGV01A0SVP,70,0302_0163,1,R_2011_09_20_15_00_06_
    GCACCATTCAGCACAGATATAGTAGCCACATCAACACAAGTTACCTAACTATATCTGTGC
    TGAATGGTGC
    >G9JVYGV01A221U,89,0328_0160,1,R_2011_09_20_15_00_06_
    CTGGACATTTACATCCATAAGTAGGAGTTAGGACTCTGCACCAGCCTCTTGAGCTTGTGA
    CGTCTCTTCTCCTCCTCCGGACTGGGACA
    >G9JVYGV01BVCPK,46,0650_0134,1,R_2011_09_20_15_00_06_
    GCAAGATCGCAAGCCAAGCAACGTTTCACGAACTGGCCAGAATGAG
    >G9JVYGV01AOU3I,81,0166_0220,1,R_2011_09_20_15_00_06_
    TCATTGACATCTGTGCAGCTGCAGGAGCGGATATGAGGAGATGGTTCTATCTGCACAGAT
    GTCAATGAGTGTGACAGTGAT
    >G9JVYGV01A0JEL,61,0299_0171,1,R_2011_09_20_15_00_06_
    CGAGTGAAGGCATTGGTGATGCTGGTGTGAAGAGTGAGGGCATCGCCAATGCCTTCACTC
    G
    >G9JVYGV01AUKIG,119,0231_0198,1,R_2011_09_20_15_00_06_
    GGCCACCAGGGCTTAACTTCCTGTGCCTCACCATCACGCAGTTGTCAGAGGATCCACATT
    GAACAAAGTAGCAATTCTTTCCACTCTGTGACACACCAACATTCTTATACAGCACCAGG
    >G9JVYGV01AJ8F7,29,0113_1333,1,R_2011_09_20_15_00_06_
    CTGCTTCCAAGCCTCCAACCTCTAACCAG
    >G9JVYGV01AMQ87,79,0142_0233,1,R_2011_09_20_15_00_06_
    AGAGTCTCCTCATTGTTCTTTCCAAGTCCTCTATTGCTGAGCCTGGTTTCGTACCTTCTC
    AGCTAGGCCCTCTTTCTCT
    >G9JVYGV01A4W45,85,0348_3895,1,R_2011_09_20_15_00_06_
    GCTTCACATCTCAGAAATATAACCGCTAATGATCTGAAACAAGTTACAATCTGACATTCT
    GAAACCAAATGAAAGCAGCATAAAC
    >G9JVYGV01A7TPA,66,0382_0140,1,R_2011_09_20_15_00_06_
    ATGGCTTACCTCACTGTCGATGGAGATCGAATGCAAGCGATGTCCATCGACAGTGAGGTA
    AGCCAT
Almost there, but I still need to put a comma after the last header field, before the sequence, and then remove the line breaks within the sequence.
Each entry starts with a greater-than sign (>), so for every entry I need six fields separated by commas, followed by a newline: the opening entry ID, the length= value, the xy= value, the region= value, the run= value, and the sequence joined onto one line (e.g. AAGGTTGGCC\n).
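To make the target format concrete, here is a minimal sketch of one way to produce it. It assumes the input headers look like ">ID length=N xy=X region=R run=RUN" (inferred from the substitutions above; the original input file is not shown here), and the helper name records_to_csv is hypothetical. It keeps the leading ">" on the first field, as in the output above.

```perl
use strict;
use warnings;

# Hypothetical helper: take the text of the whole file and return one
# CSV row per entry. Assumes every entry starts with a '>' header line
# followed by one or more sequence lines.
sub records_to_csv {
    my ($text) = @_;
    my @rows;
    my ($header, $seq);

    for my $line (split /\n/, $text) {
        $line =~ s/\s+\z//;                  # drop trailing spaces and \r
        next if $line eq '';

        if ($line =~ /^>/) {
            # A new header: emit the previous entry first, if there is one.
            push @rows, "$header,$seq" if defined $header;
            # Replace " length=", " xy=", " region=", " run=" with commas.
            $line =~ s/ +(?:length|xy|region|run)=/,/g;
            ($header, $seq) = ($line, '');
        }
        elsif (defined $header) {
            $seq .= $line;                   # join wrapped sequence lines
        }
    }
    push @rows, "$header,$seq" if defined $header;   # last entry
    return @rows;
}

# Assumed input shape, rebuilt from two entries of the output above.
my $sample = <<'END';
>G9JVYGV01A0JEL length=61 xy=0299_0171 region=1 run=R_2011_09_20_15_00_06_
CGAGTGAAGGCATTGGTGATGCTGGTGTGAAGAGTGAGGGCATCGCCAATGCCTTCACTC
G
>G9JVYGV01AJ8F7 length=29 xy=0113_1333 region=1 run=R_2011_09_20_15_00_06_
CTGCTTCCAAGCCTCCAACCTCTAACCAG
END

print "$_\n" for records_to_csv($sample);
```

The key difference from the line-by-line loop is that a row is only written once the next ">" header (or the end of the input) is reached, so the wrapped sequence lines can be joined first. For a real file, the text could be read from foot.txt in one go and the rows printed to the output filehandle instead of STDOUT.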
Apologies for not being clearer, but thanks for the help so far.