bugiep has asked for the wisdom of the Perl Monks concerning the following question:
Hi Everyone,
I am running a regex over my FASTA data, which contains multiple protein sequences, each preceded by a header line with its description. My output should consist of each header line followed by its sequence. When I print to the output file, the first header is missing; all the other header lines are present. Any insight would be very welcome. Below is my Perl script:
use strict;
use warnings;

open IN, "zea_mays.txt";
open OUT, ">zea_mays1.txt";

my @peptides;
my $seq;
my $flag = 0;

while (my $line = <IN>) {
    chomp($line);    # check the chomp function
    if ($line =~ /^>/) {
        if ($flag == 0) {
            # the first protein entry, which means nothing in the memory to do
            $flag = 1;
            next;
        }
        print "\n";
        print OUT " $line\n ";
    } else {
        $line =~ s/\s//g;
        my @peptides = split(/(?<=[RK](?!P))/, $line);
        print OUT "@peptides\n";
    }
}
close IN;
close OUT;
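A note on the script above, offered as a sketch rather than a confirmed diagnosis: when the first `>` line is read, `$flag` is still 0, so the script sets the flag and calls `next` before it ever reaches `print OUT`. That would account for the first header never being written while every later header is. A minimal rework under that assumption is below. The helper name `digest_fasta` and the command-line arguments are hypothetical (not from the original post), the lookaround is written as `(?<=[RK])(?!P)`, which should split at the same positions as the original pattern, and the stray spaces around the header are dropped.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original post): takes the lines of a
# FASTA file and returns the lines to write out.
sub digest_fasta {
    my @lines = @_;
    my @out;
    for my $line (@lines) {
        chomp $line;
        if ($line =~ /^>/) {
            # Keep every header, including the first one. There is no
            # flag and no early 'next' here.
            push @out, $line;
        }
        else {
            $line =~ s/\s//g;          # strip all whitespace from the sequence line
            next if $line eq '';       # skip blank lines
            # Split after R or K unless the next residue is P.
            my @peptides = split /(?<=[RK])(?!P)/, $line;
            push @out, "@peptides";    # peptides joined by single spaces
        }
    }
    return @out;
}

# Assumed usage: perl script.pl zea_mays.txt zea_mays1.txt
if (@ARGV == 2) {
    my ($in_path, $out_path) = @ARGV;
    open my $in,  '<', $in_path  or die "Cannot open $in_path: $!";
    open my $out, '>', $out_path or die "Cannot open $out_path: $!";
    # Reads the whole input into memory, which is fine for a sketch.
    print {$out} "$_\n" for digest_fasta(<$in>);
    close $in;
    close $out or die "Cannot close $out_path: $!";
}
```

The three-argument `open` with lexical filehandles and `or die` is a common precaution here: if the input file cannot be opened, the script stops with a message instead of silently producing an empty output file.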
Replies are listed 'Best First'.
Re: First header missing from FASTA data
by tangent (Parson) on Mar 15, 2012 at 01:42 UTC
Re: First header missing from FASTA data
by Khen1950fx (Canon) on Mar 15, 2012 at 03:13 UTC
by Anonymous Monk on Mar 15, 2012 at 03:51 UTC
Re: First header missing from FASTA data
by bugiep (Initiate) on Mar 15, 2012 at 12:10 UTC