Hi, here is one solution. I'm using Path::Tiny for file handling, and localizing the special variable $/ (the input record separator), as you hinted at.
    use strict;
    use warnings;
    use feature 'say';
    use Path::Tiny;

    # Read '>'-terminated records instead of newline-terminated lines
    local $/ = '>';

    my $fh = path('./foo.fasta')->openr;

    while ( my $paragraph = <$fh> ) {
        chomp $paragraph;    # removes the trailing '>' since $/ is '>'
        my @lines = split /\n/, $paragraph or next;    # skip empty records
        my ( $identifier, $string );
        for my $line ( @lines ) {
            if ( $line =~ /(sequence\d+)/ ) {
                $identifier = $1;    # header line: keep only the sequenceN part
            }
            else {
                $string .= $line;    # sequence line: append to the running string
            }
        }
        say "$identifier\t$string";
    }
    __END__

I used the following input file:

    >sequence1
    ACTCCCCGTGCGCGCCCGGCCCGTAGCGTCCTCGTCGCCGCCCCTCGTCTCGCAGCCGCAGCCCGCGTGG
    ACGCTCTCGCCTGAGCGCCGCGGACTAGCCCGGGTGGCC
    > sequence2
    CAGTCCGGCAGCGCCGGGGTTAAGCGGCCCAAGTAAACGTAGCGCAGCGATCGGCGCCGGAGATTCGCGA
    ACCCGACACTCCGCGCCGCCCGCCGGCCAGGACCCGCGGCGCGATCGCGGCGCCGCGCTACAGCCAGCCT
    CACTGGCGCGCGGGCGAGCGCACGGGCGCTC
    >randomstuff sequence3
    CACGACAGGCCCGCTGAGGCTTGTGCCAGACCTTGGAAACCTCAGGTATATACCTTTCCAGACGCGGGAT
    CTCCCCTCCCC
    > sequence4 blahblah
    CAGCAGACATCTGAATGAAGAAGAGGGTGCCAGCGGGTATGAGGAGTGCATTATCGTTAATGGGAACTTC
    AGTGACCAGTCCTCAGACACGAAGGATGCTCCCTCACCCCCAGTCTTGGAGGCAATCTGCACAGAGCCAG
    TCTGCACACC

and got the following output:

    $ perl foo.pl
    sequence1	ACTCCCCGTGCGCGCCCGGCCCGTAGCGTCCTCGTCGCCGCCCCTCGTCTCGCAGCCGCAGCCCGCGTGGACGCTCTCGCCTGAGCGCCGCGGACTAGCCCGGGTGGCC
    sequence2	CAGTCCGGCAGCGCCGGGGTTAAGCGGCCCAAGTAAACGTAGCGCAGCGATCGGCGCCGGAGATTCGCGAACCCGACACTCCGCGCCGCCCGCCGGCCAGGACCCGCGGCGCGATCGCGGCGCCGCGCTACAGCCAGCCTCACTGGCGCGCGGGCGAGCGCACGGGCGCTC
    sequence3	CACGACAGGCCCGCTGAGGCTTGTGCCAGACCTTGGAAACCTCAGGTATATACCTTTCCAGACGCGGGATCTCCCCTCCCC
    sequence4	CAGCAGACATCTGAATGAAGAAGAGGGTGCCAGCGGGTATGAGGAGTGCATTATCGTTAATGGGAACTTCAGTGACCAGTCCTCAGACACGAAGGATGCTCCCTCACCCCCAGTCTTGGAGGCAATCTGCACAGAGCCAGTCTGCACACC
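
If you want to try the record-separator trick without Path::Tiny or an input file, here is a minimal sketch that reads from an in-memory string instead. The sub name fasta_records and the sample data are made up for illustration, and unlike the script above it keeps the whole (trimmed) header line rather than matching sequence\d+, so treat it as one possible variation rather than the way to do it.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical helper (an assumption, not part of the script above):
# split FASTA text into [ header, sequence ] pairs by reading
# '>'-terminated records.
sub fasta_records {
    my ($text) = @_;
    my @records;

    # Open the string as an in-memory filehandle so no file is needed.
    open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";

    # Confine the change to $/ to this sub; it is restored on return.
    local $/ = '>';

    while ( my $chunk = <$fh> ) {
        chomp $chunk;    # removes a trailing '>' because $/ is '>'
        my ( $header, @seq_lines ) = split /\n/, $chunk;
        next unless defined $header;    # the chunk before the first '>' is empty
        $header =~ s/^\s+|\s+$//g;      # tolerate "> name" style headers
        next unless length $header;
        push @records, [ $header, join( '', @seq_lines ) ];
    }
    close $fh;
    return @records;
}

my $demo = ">seqA\nACGT\nTTGA\n> seqB\nGGCC\n";
say join "\t", @$_ for fasta_records($demo);
```

Because local is scoped to the sub, any code that runs afterwards still reads line by line as usual.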
Hope this helps!
In reply to Re: Converting fasta (with multiple sequences) into tabular using perl
by 1nickt
in thread Converting fasta (with multiple sequences) into tabular using perl
by rarenas