Better now that you formatted the files' content! But it's still not clear what your objective is. How did the text under 'contig1' and 'contig4' get processed into the output file? Can you show the code for that? Why didn't the text under 'contig2' and 'contig3' get processed correctly? You need to understand why, otherwise you won't be able to fix the problem and it might happen again.
So please post the code you have so far.
If your problem is that when you processed your data file, it only processed the first and the last sections, maybe you just need a better regexp? You can get the strings out of your data file very easily with Perl.
#!/usr/bin/env perl
use strict;
use warnings;
use File::Slurp::Tiny 'read_file';

my $file = 'contig.txt';
my $slurp = read_file($file);
my %results;

while ( $slurp =~ />Contig(\d+).+?([A-Z]+)/sg ) {
    $results{ $1 } = $2;
}

foreach my $test_number (sort keys %results ) {
    print "Test $test_number: $results{ $test_number }\n\n";
    ## do something to process $test_number, $results{ $test_number } ...
}

__END__
OUTPUT:
Test 1: GAGCTAAATAATTTGAATCAATGGGAAGATCACCGTGTTGTGAAAAAGCACATACAAATAAAGGAGCTTGGACTAAAGAAGAAGATGAACGACTTATTTCTTATATTAAAACTCACGGCGAAGGTTGCTGGAGATCCCTTCCTAAAGCTGCCGGACTTCTCCGATGCGGTAAAAGTTGCCGTCTCCGATGGATTAATTACTTGAGACCGGACCTTAAACGCGGTAATTTTACTGAAGAAGAAGATGAACTCATTATCAAACTCCATAGCCTCCTTGGTAACAAATGGTCACTTATAGCCGGAAGATTACCAGGAAGAACAGATAATGAGATAAAAAATTACTGGAATACGCACATAAGAAGGAAGCTTTTGAGTCGGGGCATTGATCCAACGACACACAGGCCTGTTAACGAGCCTGGTACAACGCAAAAAGTCACAACAATTTCATTTGCAGGTGGAGATCATAAAACTAAAGATATTGAAGAAGATCATAATAAGATGATAAATGTCAAAGCTGAATCTGGGTTGAGTCAATTAGAAGATGAAATTATTAGTAGCAGTCCATTTCGAGAACAGTGTCCTGATTTAAATCTTGAGCTCAAATTAGCCCTCCTTCTCTACAAAATTACCAACATAGCCCCTCAAGGTGTTTTGCATGCAGTTTGGGTATACAAAATAGTAAAGATTGCAATTGCAGTAAAAATAATATTGCAAGTTATAACTTTTTAGGATTAAAGAGTAATGGTGTTTTGGACTATAGAACTTTAGAAACTAAGTGAATTTTTATTATAAATCTTTTTTTCCCTCGTGTATTTGGGTTAAAAAAACAAGAAGAGAGAATCGAGAAAGATATTCCTATTAGTTTAAGTTCTTTCGAATTTTCTCTTATTTGTAAAATTTCAAGTATTACTATATACGATATATTATATTAAGTTGAAAAG

Test 2: GCTCTTCCAACAACAACAACAATGCCTCATCAAAAGCCTCTTTCTCTCATTCTTCTATCTACACTCCCACTTCTTTTCATTCTCACACAAGCTCAATCACCAACAGCACCAGCACCAGCACCCTCAGGACCAATAGACATCTTTGCAATCCTCAAAAAAGAAGGACAATACAACACATTCATCAAGTTCCTAAATGAATCACAAGTTGGTAACCAAATCAACAACCAAGTAAACAACTCCAACCAAGGCATGACAGTTTTGGCACCATCAGACAATGCATTTAACAACCTCCCAAGTGGTACACTCAACCAACTAAATGACCAACAAAAAGTACAACTCATTTTGAACCATGTCATACCAAAGTTCTACACATTTGATGACTTACAAACAGTAAGCAACCCTGTTAGAACACAAGCAACAGGGCCTAAAGGTGAGCCTTTTGGACTTAACTTTACTGGAAGTAACAATCAAGTGAATGTCTCATCTGGTTCTGTTGTTACAAACATTTATAATGCTATTAGAAAAGACCCCCCATTGGCTGTTTTTCAATTAGACAAAGTTTTAGTACCTTCTCAGTTTACTGATCCATCTAGTGATGATGATGCCCCTGCACCTACTAAACCCAAGAATGGTACTAGTAATGATAAAACAACAGCTGATGAGCCATCACCAGCAAGTAACACTAAGCCAAATGATGCTAAAAGGATCAGTGGTGGGATTCTTGGATTGGTTTGTGGTGTTTTCTTGATGGCAACACTATCTTGAAGGGGGCTACAGAGTTGTTAACTTTATGATCTTTTGCTTATACTAAGCCATTTTGTATTACATTGTTTTCTTCAAGATTGATTGTTTTTGTTCAAAAAAGAAGGGGGGGGGGGAAAAAAAAACCCCCCTGCGGAAAAGAGCGGGGAAAGCACCAAAAAGCCACCGACCAAAAGCACCAACTCACAAAAGGTGCGCAGACGCGGAAAGGGGAAAAGGAAAAAATGTGAAAGCTTGTTATAGTTTG

Test 3: AAACTGTAATTAGACTTCTCTGCTAAGTTTCTGCTGTATTTGGATTCTCCGGCGAACATTAATATCTAACCATGACCGGCGGTGGAGGCGATGCCGCATCGCCGCCTCTATCCTCACAGTCAACTCCATCCAACGGTGGGGAATTCCTTCTTCAATTGCTTCAGAATCATCCGCATCAACTTCACTCTCAGCCTCAACCGCCACTGCGGCCGGAGTTGCAGAATCTGCCGCATGATCCAGCAGTTGCAGCAGTAGGTCCTAGTATGCCCTACCCGCCATTGTTCCATACTCCTACAAACCCTTCTGTTTTGCCCTATTCTCACTCTCCTCCTCTGTTTGTACCTCATAACTTCTTCATTCGAGGGTTTCTCCAAAACCCTAATTCTGGCCATACCACTAACCCCAATTACTCATCTCCGCCTGCCCCAAGTGGGTTCAGTCAATATCACCATGCGAGTCCACTTGGATTTGGATCAGTCGGAGAAAACATGGGCAATTTGGGGATTTTCGGTGCCAATGCTAAGGCGAG

Test 4: CATGTAATAGCATAGCATCCCCAATTTCACCCTCTCATGGCCATGTCCACGCTCCTCTCCCTGTCCGTGTCTATCCACCCACCAAAACCTTTGCAAAAACCCAATTCAATGTGTACCCAACCTAACTCTATTTCGAGAAGACAAGTGTTTTTCACTGGTTCTAATTTATTGCTCTCTCAATTAATTCCAAAATCCGACGCCCAAACCAATTCCAATAGTTTTCTTTCAGGTATTGCCAATACTAAGTCTTGGTTCCAATTCTATGGCGACGGCTTTTCTATTCGTGTTCCACCGGAATTTCAGGACCTCACTGAGCCGGAGGATTATAATGCTGGCCTATCACTATATGGAGATAAGGCTAAGCCCAAAAAATTTGCAGCACGTTTTGCTTCTTCTGATGGATCCGAAGTTTTAAGTGTCATAATTCGTCCATCCAATCAGCTGAAGATCACTTTCTTAGAGGCTAAAGATATTACTGATTTAGGTTCACTTAAGGAGGCAGCAAAAATATTTGTTCCAGCTGGCTCAACACTATATTCTGTCCGCACAATAAAAATTAAAGAAGATGAGGGTTTCAGGACATACTATTTTTATGAATTTGTGAGAAATGAGCAACACGTTGCATTAGTGGCTGGTGTTAACAGTGGAAAGGCCGTCATTGCTGGTGCCACGGCCCCCGAAAGCAAATGGGCCGAGGATGGTTTGAAGCTCCGATCTGCTGCAGTATCAATGACAATTCTATAAGCAGAATGTGAGTATATATATAGGTTCTATTTCAATGATGATGAATTTATATACAAATATTGAGGATCAAAGTTTTCTTATTATCATCTAATCTCAGCCAAGGATTAACAAT CTCCATCATCCATTCAATAGCAATGTTTCTGCTGTTTTGC
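If your sequences are wrapped over several lines, the pattern in the script above would stop at the first line break, because `[A-Z]+` cannot match a newline. Here is a minimal sketch of one way around that, run against an in-memory string so it needs no extra modules. The sample data and the `parse_contigs` helper name are made up for illustration, and it assumes each record starts with a header line of the form `>ContigN ...` followed by the sequence lines; adjust it to whatever your file really looks like.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: turn FASTA-like text into { contig number => sequence }.
sub parse_contigs {
    my ($text) = @_;
    my %seq_for;

    # Match a header line, then capture everything up to the next '>' (or end of text).
    while ( $text =~ /^>Contig(\d+)[^\n]*\n([^>]*)/mg ) {
        my ( $number, $body ) = ( $1, $2 );
        $body =~ s/\s+//g;    # join wrapped lines by removing all whitespace
        $seq_for{$number} = $body;
    }
    return \%seq_for;
}

# Made-up sample data, not the real contig.txt.
my $sample = <<'END';
>Contig1 length=8
GAGC
TAAA
>Contig2 length=4
GCTC
>Contig10 length=4
AAAC
END

my $seq_for = parse_contigs($sample);

# Numeric sort, so that contig 10 comes after contig 2.
for my $number ( sort { $a <=> $b } keys %$seq_for ) {
    print "Contig $number: $seq_for->{$number}\n";
}
```

The explicit `sort { $a <=> $b }` matters once you have ten or more contigs: a plain `sort keys` compares the numbers as strings and would list contig 10 before contig 2.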
In reply to Re: how to create output file using perl
by 1nickt
in thread how to create output file using perl
by vineetha