Hello, I have a FASTA file with about 400 DNA sequences. I'm trying to come up with a Perl script that removes all sequences shorter than 500 nucleotides and writes the remaining sequences (500 nucleotides or longer) to a new file. Here is an example of what the FASTA file looks like:
>C1_A01_R.trimmed.seq (Quality-trimmed) Agencourt Bioscience Corporation ABI
GGCCGCCAGTGTGCTGGAATCCGCCCTTAACCTGGTTGATCCCGCCAGTAGTCATACGCT
CGTCTCAAAGATTAAGCCATGCATGTCTAAGTATAACTCTTTTACTTTGAAAACTGCGAA
CGGCTCATTATATCAGTTATAGTTTATTTGATAGTCCCTTACTACTTGGATACCCGTAGT
AATTCTAGAGCTAATACATGCATCAATACCCAACTGTTCGCGGAAGGGTAGTATTTATTA
GGTATAGACCAACCGTCTTCGGACGTGCTTTGGTGATTCATAATAACTTTTCGAATCGCA
TGGCTCCATGCCGGCGATGGATCATTCAAGTTTCTGCCCTATCAGCTTTGG
>C1_A03_R.trimmed.seq (Quality-trimmed) Agencourt Bioscience Corporation ABI
CCGAAGTAATTCTAGAGCTAATACATGCA
>C1_A04_R.trimmed.seq (Quality-trimmed) Agencourt Bioscience Corporation ABI
TAGTAACGGCCGCCAGTGTGCTGGAATTCGCCCTTAACCTGGTTGATCCTGCCAGTAGTC
ATACGCTCGTCTCAAAGATTAGGCCATGCATGTCTAAGTATAACTCTTTTACTTTGAAAA
CTGCGAACGGCTCATTATATCAGTTATAGTTTATTTGATAGTCCCTTACTACTTGGATAC
CCGTAGTAATTCTAGAGCTAATACATGCATCAATACCCGACTGTTCGCGGAAGGGTAGTA
TTTATTAGGTATAGACCAACCGTCTTCGGACGTGCTTTGGTGATTCATAATAACTTTTCG
AATCGCATGGCTCCATGCCGGCGATGGATCATTCAAGTTTCTGCCCTATCAGCTTTGGAT
GGTAGTGTATTGGACTACCATGGCTTTAACGGGTAACGAATTGTTAGGGCAAGATTTCGG
AGAGGGAGCCTGAGAGACGGCTACCACATCCAAGGAAGGCAGCGGGCGCGTAAATTACCC

Does anyone out there have any scripting suggestions????
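For what it's worth, here is a minimal sketch of one way to approach the filtering described above. It is not a tested solution for this exact data. It assumes every header starts with `>` at the beginning of its own line, and that "less than 500" means records of exactly 500 nucleotides are kept. The `filter_fasta` sub and the file names in the usage comment are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return only the FASTA records whose sequence has
# at least $min_len nucleotides. Whitespace (line breaks, spaces, \r)
# inside the sequence is not counted toward the length.
sub filter_fasta {
    my ($fasta_text, $min_len) = @_;
    my @kept;

    # Split just before every '>' that begins a line, so that each chunk
    # holds one header line plus its sequence lines.
    for my $record (split /^(?=>)/m, $fasta_text) {
        next unless $record =~ /^>/;    # skip anything before the first header

        my ($header, @seq_lines) = split /\n/, $record;
        my $seq = join '', @seq_lines;
        $seq =~ s/\s+//g;               # count nucleotides only

        # Keep the record, with its original line layout, if long enough.
        push @kept, join("\n", $header, @seq_lines) . "\n"
            if length($seq) >= $min_len;
    }
    return join '', @kept;
}

# Assumed usage (file names are placeholders):
#   perl filter_fasta.pl input.fasta output.fasta
if (@ARGV == 2) {
    my ($in_file, $out_file) = @ARGV;

    open my $in_fh, '<', $in_file or die "Cannot open $in_file: $!";
    my $text = do { local $/; <$in_fh> };    # slurp the whole file
    close $in_fh;

    open my $out_fh, '>', $out_file or die "Cannot open $out_file: $!";
    print {$out_fh} filter_fasta($text, 500);
    close $out_fh or die "Cannot close $out_file: $!";
}
```

Slurping the whole file is fine for roughly 400 sequences. For much larger files, reading one record at a time would probably be the better choice.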
In reply to Parse DNA fasta file by Anonymous Monk