Seabass has asked for the wisdom of the Perl Monks concerning the following question:
Hello friends! Newbie to Perl, but quickly learning.
I need help with a script that will read a text file, separate the fields with commas, then print the result to another file.
The purpose is to prepare raw pyro-sequencing files for upload into a SQL database.
Example of the infile:

>G9JVYGV01AJE8V length=135 xy=0104_0349 region=1 run=R_2011_09_20_15_00_06_
GGTGGTAGTGAAGAAGAGGAGATGAAAGTGGAAGAGGTTGAGGATGAGAAGGTTGAATTG
GAAGAAGAAGATGAGAAGGTTGAAGTGGAAGATGAGAAGGTTGAAGTGGAAGAAGATGAA
GTGGAAGAGAGGAGC
>G9JVYGV01A4910 length=90 xy=0353_0150 region=1 run=R_2011_09_20_15_00_06_
GGTGCATGGCATTGTAGATGGTTGCTTGATAGTTGCCCATACGTGTACTACACTTGCAGA
GTGAAGCAACCATCTACAATGCCATGCACC
>G9JVYGV01A0SVP length=70 xy=0302_0163 region=1 run=R_2011_09_20_15_00_06_
GCACCATTCAGCACAGATATAGTAGCCACATCAACACAAGTTACCTAACTATATCTGTGC
TGAATGGTGC
>G9JVYGV01A221U length=89 xy=0328_0160 region=1 run=R_2011_09_20_15_00_06_
CTGGACATTTACATCCATAAGTAGGAGTTAGGACTCTGCACCAGCCTCTTGAGCTTGTGA
CGTCTCTTCTCCTCCTCCGGACTGGGACA
>G9JVYGV01BVCPK length=46 xy=0650_0134 region=1 run=R_2011_09_20_15_00_06_
GCAAGATCGCAAGCCAAGCAACGTTTCACGAACTGGCCAGAATGAG
>G9JVYGV01AOU3I length=81 xy=0166_0220 region=1 run=R_2011_09_20_15_00_06_
TCATTGACATCTGTGCAGCTGCAGGAGCGGATATGAGGAGATGGTTCTATCTGCACAGAT
GTCAATGAGTGTGACAGTGAT
>G9JVYGV01A0JEL length=61 xy=0299_0171 region=1 run=R_2011_09_20_15_00_06_
CGAGTGAAGGCATTGGTGATGCTGGTGTGAAGAGTGAGGGCATCGCCAATGCCTTCACTC
G
Each entry begins with a '>' character, and fields are separated by whitespace.
I want to replace all whitespace with commas, delete the text 'length=' from its corresponding field, remove all newlines/EOL markers, and separate each entry with a new line.
I am new, but I figured out how to read/write files, replace whitespaces with commas, and chomp all new lines.
Here is the code I have so far:

use strict;
use warnings;
print "\n Running script for Jiggs \n";
my $infile=<foot.txt>;
open(my $in,'<', "$infile") or die $!;
open(my $out, '>' ,'foot1.txt') or die $!;
my $line = <$in>;
chomp($line);
$line =~ s/ /,/g;
print $out "$line";
$line =<$in>;
print $out "$line";
while($line =<$in>) {
    chomp($line);
    $line =~ s/ /,/g;
    print $out "$line";
}
close ($in);
close ($out);
print "\n Done!\n";

Here is an example of the output I want:

>G9JVYGV01AJE8V,135,xy=0104_0349,region=1,run=R_2011_09_20_15_00_06_GGTGGTAGTGAAGAAGAGGAGATGAAAGTGGAAGAGGTTGAGGATGAGAAGGTTGAATTGGAAGAAGAAGATGAGAAGGTTGAAGTGGAAGATGAGAAGGTTGAAGTGGAAGAAGATGAAGTGGAAGAGAGGAGC
>G9JVYGV01A4910,90,xy=0353_0150,region=1,run=R_2011_09_20_15_00_06_GGTGCATGGCATTGTAGATGGTTGCTTGATAGTTGCCCATACGTGTACTACACTTGCAGAGTGAAGCAACCATCTACAATGCCATGCACC
>G9JVYGV01A0SVP,70,xy=0302_0163,region=1,run=R_2011_09_20_15_00_06_GCACCATTCAGCACAGATATAGTAGCCACATCAACACAAGTTACCTAACTATATCTGTGCTGAATGGTGC
>G9JVYGV01A221U,89,xy=0328_0160,region=1,run=R_2011_09_20_15_00_06_CTGGACATTTACATCCATAAGTAGGAGTTAGGACTCTGCACCAGCCTCTTGAGCTTGTGACGTCTCTTCTCCTCCTCCGGACTGGGACA
>G9JVYGV01BVCPK,46,xy=0650_0134,region=1,run=R_2011_09_20_15_00_06_GCAAGATCGCAAGCCAAGCAACGTTTCACGAACTGGCCAGAATGAG
>G9JVYGV01AOU3I,81,xy=0166_0220,region=1,run=R_2011_09_20_15_00_06_TCATTGACATCTGTGCAGCTGCAGGAGCGGATATGAGGAGATGGTTCTATCTGCACAGATGTCAATGAGTGTGACAGTGAT
>G9JVYGV01A0JEL,61,xy=0299_0171,region=1,run=R_2011_09_20_15_00_06_CGAGTGAAGGCATTGGTGATGCTGGTGTGAAGAGTGAGGGCATCGCCAATGCCTTCACTCG
>G9JVYGV01AUKIG,119,xy=0231_0198,region=1,run=R_2011_09_20_15_00_06_GGCCACCAGGGCTTAACTTCCTGTGCCTCACCATCACGCAGTTGTCAGAGGATCCACATTGAACAAAGTAGCAATTCTTTCCACTCTGTGACACACCAACATTCTTATACAGCACCAGG
>G9JVYGV01AJ8F7,29,xy=0113_1333,region=1,run=R_2011_09_20_15_00_06_CTGCTTCCAAGCCTCCAACCTCTAACCAG
>G9JVYGV01AMQ87,79,xy=0142_0233,region=1,run=R_2011_09_20_15_00_06_AGAGTCTCCTCATTGTTCTTTCCAAGTCCTCTATTGCTGAGCCTGGTTTCGTACCTTCTCAGCTAGGCCCTCTTTCTCT
>G9JVYGV01A4W45,85,xy=0348_3895,region=1,run=R_2011_09_20_15_00_06_GCTTCACATCTCAGAAATATAACCGCTAATGATCTGAAACAAGTTACAATCTGACATTCTGAAACCAAATGAAAGCAGCATAAAC
>G9JVYGV01A7TPA,66,xy=0382_0140,region=1,run=R_2011_09_20_15_00_06_ATGGCTTACCTCACTGTCGATGGAGATCGAATGCAAGCGATGTCCATCGACAGTGAGGTAAGCCAT

I am using Windows 7 and Perl 5.12.4.
Your help is very much appreciated. I'll check back and update later today as I keep trying to work this out. Thanks.
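
For reference, here is one possible way the transformation described above could be sketched. It is a rough illustration under stated assumptions, not a definitive solution. The helper name `reformat_records` is hypothetical, and the sample data is an in-memory copy of two entries from the post rather than the real file. It assumes every entry starts with a '>' header line and that all following lines up to the next '>' are sequence data. Like the desired output shown in the post, it puts no comma between the run field and the sequence.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): reads '>'-delimited
# records from a filehandle and returns one comma-separated line per entry.
sub reformat_records {
    my ($in) = @_;
    my @records;
    my $current;

    while ( my $line = <$in> ) {
        $line =~ s/\s+\z//;            # drop the newline (LF or CRLF) and trailing blanks
        next if $line eq '';           # skip empty lines

        if ( $line =~ /^>/ ) {         # header line: a new entry begins
            push @records, $current if defined $current;
            $line =~ s/\blength=//;    # 'length=135' becomes '135'
            $line =~ s/\s+/,/g;        # whitespace between fields becomes commas
            $current = $line;
        }
        elsif ( defined $current ) {   # sequence line: append to the open entry
            $line =~ s/\s+//g;
            $current .= $line;
        }
    }
    push @records, $current if defined $current;   # flush the last entry
    return @records;
}

# Demo on an in-memory copy of two entries from the post.
my $sample = <<'END';
>G9JVYGV01BVCPK length=46 xy=0650_0134 region=1 run=R_2011_09_20_15_00_06_
GCAAGATCGCAAGCCAAGCAACGTTTCACGAACTGGCCAGAATGAG
>G9JVYGV01A0SVP length=70 xy=0302_0163 region=1 run=R_2011_09_20_15_00_06_
GCACCATTCAGCACAGATATAGTAGCCACATCAACACAAGTTACCTAACTATATCTGTGC
TGAATGGTGC
END

open my $in, '<', \$sample or die "Cannot open in-memory sample: $!";
my @records = reformat_records($in);
close $in;

print "$_\n" for @records;
```

To run it against real data, the in-memory handle could be swapped for one opened on 'foot.txt', with the records printed to a handle opened on 'foot1.txt', as in the script above. If the sequence should end up in its own database column, appending a comma to the header before the sequence is added would be a small change.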