in reply to Re: very basic, just beginning
in thread very basic, just beginning

I'll second that. If you're doing any sort of biological sequence manipulation, you must have BioPerl.

Here's some incentive to get it installed. Run this with an argument of *.txt (unquoted, so the shell expands it to the list of files).

use strict;
use warnings;
use Bio::Seq;
use Bio::SeqIO;

while (my $fname = shift) {
    open(my $inseq, '<', $fname) or die "Can't open $fname: $!";

    ## Grab all the sequence in a single string
    my $seq = '';
    while (<$inseq>) {
        chomp;
        $seq .= $_;
    }
    close $inseq;

    $fname =~ s{\.txt$}{};    # strip .txt

    ## Create a new Bio::Seq object with your sequence
    my $seqobj = Bio::Seq->new(-display_id => $fname,
                               -seq        => $seq);

    ## Write it out to filename.fa
    my $outfile = $fname . ".fa";
    my $seqout  = Bio::SeqIO->new(-format => 'fasta',
                                  -file   => ">$outfile");
    $seqout->write_seq($seqobj);
}
It'll do exactly what you asked. Plus, just change 'fasta' to another format name that Bio::SeqIO supports (such as 'genbank'), and it will write that format instead.
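
To make the format-swapping point concrete, here is a minimal sketch of a general converter. The sub name convert_seqfile and the file names in the usage note are made up for illustration; only Bio::SeqIO->new, next_seq and write_seq are real BioPerl calls. It assumes both format names are ones your Bio::SeqIO installation supports.

```perl
use strict;
use warnings;
use Bio::SeqIO;

# Hypothetical helper (not part of BioPerl): copy every record from one
# file to another, changing only the format names passed to Bio::SeqIO.
sub convert_seqfile {
    my ($infile, $informat, $outfile, $outformat) = @_;

    my $in  = Bio::SeqIO->new(-file => $infile,     -format => $informat);
    my $out = Bio::SeqIO->new(-file => ">$outfile", -format => $outformat);

    my $count = 0;
    while (my $seq = $in->next_seq) {
        $out->write_seq($seq);    # written in whatever $outformat names
        $count++;
    }
    return $count;                # number of records converted
}
```

Something like convert_seqfile('myseq.fa', 'fasta', 'myseq.gb', 'genbank') should then turn a FASTA file into a GenBank one, though fields the input format doesn't carry (features, annotations) will simply be empty in the output.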

Re: Re: Re: very basic, just beginning
by jmanning2k (Pilgrim) on Aug 27, 2003 at 18:44 UTC
    Well, I didn't realize your input files were all newline-free. I should have read the question more closely.

    In that case, BioPerl is even better...

    use strict;
    use warnings;
    use Bio::SeqIO;

    while (my $fname = shift) {
        ## raw is one seq per line (no newlines)
        my $in = Bio::SeqIO->new(-file   => $fname,
                                 -format => 'raw');

        $fname =~ s{\.txt$}{};    # strip .txt
        my $outfile = $fname . ".fa";
        my $out = Bio::SeqIO->new(-file   => ">$outfile",
                                  -format => 'fasta');

        ## Do the conversion
        while (my $seq = $in->next_seq()) {
            $seq->display_id($fname);    # Add a name
            $out->write_seq($seq);
        }
    }
    Run it in the same way, with an argument of *.txt.