ad23 has asked for the wisdom of the Perl Monks concerning the following question:
Hello All,
I am trying to parse a FASTA file using Perl. The following is the input file:
>CVSF43565.d1 bg|346278
CAGACACACTTCTTTTAGTTGAGACACATGGAAAACATCATGTATGGCAGACAACTGTTCTGGGAGTTGG
ATCCGGGTAAGCAACGGGTCCACATATCTCCACAATCTCATAAGGGGCCAACATAGCGGGGGAGCTAACT
TGCCTTTGATTCCAAACCGTTGCACTCCTTTGGTCGGGGAAACTCGAAGGTACACATGATCACCAAGGTC
GAACTGCAGGGGTCTTCTCCGCTGGTCGGAGTAGCTCTTCTGTCAAGATTGGGCGGCCTTGAGATGTGCT
TGAATTACCTTCACTTGCTCTTCGGCTTCTGCCACTTAAGTCAGGGCCATAGACCTGTCTCTCCCCTGGG
CAGACACACTTCTTTTAGTTGAGACACATGGAAAACATCATGTATGGCAGACAACTGTTCTGGGAGTTGG
ATCCGGGTAAGC
>CVSF43566.d1 bg|346279
CAGACACACTTCTTTTAGTTGAGACACATGGAAAACATCATGTATGGCAGACAACTGTTCTGGGAGTTGG
ATCCGGGTAAGCCAGACACACTTCTTTTAGTTGAGACACATGGAAAACATCATGTATGGCAGACAACTGT
TCTGGGAGTTGGAATGCTAGTCGATCGCCAGACACACTTCTTTTAGTTGAGACACATGGAAAACATCATG
TTGGCAGACAACTGTTCTGGGAGTTGGATCCGGGTAAGCCAGACACACTTCTTTTAGTTGAGACACATGG
AAAACATCATGTATGGCAGACAACTGTTCTGGGAGTTGGATCCGGGTAAGC
>CVSF43567.d1 bg|346280
CGTAGCTGATGCTGTGCTGTTGTGTCGGGGGGATATATATATATATATGGGGTCGTAGTCGTAGCGCTAG
TATGCTAGCAGCGTAGATGCTGATCGATGCTGATGCTGATCGTAGTCGTAGGCTAGTGCGATCGTAGTCG
TAGTCGATGCTGATGCGTAGCTGATGTGCTGCTGATGCTAGTCGTCGTAGCTGATGCATGCTGATCGTAG
TGCTCGATGCTAGTCGTAGTCGTAGTCGTAGCGACTGATGCGATCGTAGTCGGATGCTAGCACGTAGCTG
GCTCGATGCTGATGCTGAT
>CVSF10000.x1 bg|356789 pair:789860
ATGCGTAGCTGATGTGCTGCTGATGCTAGTCGTCGTAGCTGATGCATGCTGATCGTAGTGCTCGATGCTA
GTCGTAGTCGTAGTCGTAGCGACTGATGCGATCGTAGTCGGATGATGCTGACTGATGCTGATCTGTACGT
CGTAGCTGATGCATGCGCTAGTAGCT
>CVSF10000.y1 bg|356790 pair:789859
GCTAGTCGATGCTGATGCTGTAGCTAGCGTAGTCGTACGCGCGCGCGCGCGTTTTTTGTGACGTCGTAGT
CCGTAGCTGATGCGATGCTAGTGCTGTGTCAGCTGATGTCGTGTGTAGCTGATGCTGATCGTTCGTGTGT
CGATGCTGATGCTAGTCGTAGTGTAT
>CVSF10001.x1 bg|356791 pair:789862
AGTCGTAGTCGTAGCTGTAGCTGATGCTGTGTACGATGCTGATGCGATGCGTAGCGTAGCATCGATGCTA
CGACTAGTCGTAGTCGTC
>CVSF10001.y1 bg|356792 pair:789861
CGTAGCTGATGCTGATCGTAGTCGTAGTCGATGCGATGCTAGTCGTAGCTGTAGCTGATGCTGCGTGCTG
CAGTCGATGCTAGTCGATGCTGATCGTCTAGCAT
I want to write the header lines that have a "pair" field (and the sequence data that follows them) to one file, and the header lines without a "pair" field (and their data) to another.
However, with the following code I am only able to write the header lines. I also want the sequence data following each header line (ATGCTAGCTG...) to be included in the output files.
Any suggestions?
#!/usr/bin/perl
my $in = $ARGV[0];
my $p = $ARGV[1];
my $s = $ARGV[2];
open IN, "<$in" or die $!;
open P_OUT, ">$p" or die $!;
open S_OUT, ">$s" or die $!;
while(<IN>){
    chomp;
    if(/^>/){
        my @header = split / /;
        if($header[2] ne ''){
            print P_OUT "$header[0]"." "."$header[1]"." "."$header[2]\n";
        }
        else{
            print S_OUT "$header[0]"." "."$header[1]\n";
        }
    }
    #unless(/^>/){
    #print OUT "$_\n";
    #next;
    #}
}
close(IN);
close(P_OUT);
close(S_OUT);
Thanks!!!
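A minimal sketch of one way to keep the sequence lines with their header, offered under assumptions and not as the approach taken in any of the replies: remember which output handle the most recent header selected, and send every following line to that same handle. The subroutine name split_fasta is hypothetical, and the sketch assumes that a paired header always carries a field of the form "pair:" followed by digits, as in the sample input.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Copy each FASTA record (header plus its sequence lines) to one of two
# filehandles, depending on whether the header carries a "pair:" field.
# split_fasta is a hypothetical helper name used for illustration.
sub split_fasta {
    my ($in_fh, $paired_fh, $single_fh) = @_;
    my $out;    # handle selected by the most recent header line
    while (my $line = <$in_fh>) {
        if ($line =~ /^>/) {
            # Assumption: the pair field looks like "pair:<digits>".
            $out = $line =~ /\spair:\d+/ ? $paired_fh : $single_fh;
        }
        # Sequence lines go to the handle of the header above them;
        # anything before the first header is skipped.
        print {$out} $line if defined $out;
    }
    return;
}

# Run only when three arguments are given:
#   perl split_fasta.pl input.fa paired.fa single.fa
if (@ARGV == 3) {
    my ($in, $p, $s) = @ARGV;
    open my $in_fh, '<', $in or die "Cannot read $in: $!";
    open my $p_fh,  '>', $p  or die "Cannot write $p: $!";
    open my $s_fh,  '>', $s  or die "Cannot write $s: $!";
    split_fasta($in_fh, $p_fh, $s_fh);
    close $_ for $in_fh, $p_fh, $s_fh;
}
```

Because the lines are never chomped, each one is written back out with its original line ending, so the records in both output files keep the layout of the input.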
Replies are listed 'Best First'.
Re: Print the data following few specific lines in perl
by BioLion (Curate) on Jul 01, 2010 at 16:36 UTC
by ad23 (Acolyte) on Jul 01, 2010 at 17:47 UTC
by BioLion (Curate) on Jul 05, 2010 at 08:02 UTC
Re: Print the data following few specific lines in perl
by toolic (Bishop) on Jul 01, 2010 at 18:06 UTC
by ad23 (Acolyte) on Jul 01, 2010 at 18:30 UTC
by toolic (Bishop) on Jul 01, 2010 at 18:40 UTC
by ad23 (Acolyte) on Jul 01, 2010 at 18:48 UTC
Re: Print the data following few specific lines in perl
by biohisham (Priest) on Jul 01, 2010 at 23:37 UTC