in reply to writing to arrays

This looks like a multi-FASTA file holding DNA or protein sequence data (with the sequence ID after the >). I use one of the following two ways to get this info into an array before piping it into BLAST or other sequence manipulations. The first is a loop from some hand-me-down code that works quite well (but any comments on optimization etc. are very welcome...)
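    use strict;
    use warnings;

    my ( @sequences, $seqflag );
    my $fasta = '';
    open( my $fh, '<', $ARGV[0] ) or die "Cannot open $ARGV[0]: $!";
    while (<$fh>) {
        if ( /^>/ && $seqflag ) {    # later header: store the previous record
            push @sequences, $fasta;
            $fasta = $_;
        }
        elsif (/^>/) {               # first header in the file
            $fasta   = $_;
            $seqflag = 1;
        }
        else {                       # sequence line: append to the current record
            $fasta .= $_;
        }
    }
    close $fh;
    push @sequences, $fasta if length $fasta;    # store the last record
    # then iterate over @sequences to run each record through BLAST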
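On the optimization question, one possible variation (a sketch, not part of the hand-me-down code) is to let Perl do the splitting by setting the input record separator $/ to "\n>", so each read returns one whole record. The helper name read_fasta_records and the demo sequences are made up for illustration, and the sketch assumes the file starts with a header line and uses "\n" line endings.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): return one string per
# FASTA record by reading with "\n>" as the input record separator.
# Assumes the input starts with a ">" header and uses "\n" line endings.
sub read_fasta_records {
    my ($fh) = @_;
    local $/ = "\n>";    # each read now ends just before the next header
    my @records;
    while ( my $rec = <$fh> ) {
        chomp $rec;                  # strip the trailing "\n>" separator, if any
        $rec =~ s/^>//;              # the first record still carries its own ">"
        next unless $rec =~ /\S/;    # skip empty chunks
        $rec .= "\n" unless $rec =~ /\n\z/;
        push @records, ">$rec";      # put the header marker back
    }
    return @records;
}

# Demo on an in-memory file holding two made-up records.
my $demo = ">seq1 first\nACGT\nTTGA\n>seq2 second\nMKV\n";
open( my $fh, '<', \$demo ) or die "Cannot open in-memory file: $!";
my @sequences = read_fasta_records($fh);
close $fh;

print scalar(@sequences), " records read\n";
print $sequences[0];
```

To use it on a real file, open $ARGV[0] instead of the in-memory string. Each element of @sequences has the same shape as in the loop above: the header line followed by its sequence lines.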
The other (better?) way is to use the very nice BioPerl modules, which have methods that specifically handle multi-FASTA flat files. Also check out EMBOSS, a sequence analysis suite that interfaces with BioPerl... EMBOSS + BioPerl makes life sooo much easier... From the BioPerl tutorial...
    # script 1: create the index
    use Bio::Index::Fasta;    # using fasta file format
    $Index_File_Name = shift;
    $inx = Bio::Index::Fasta->new(
        -filename   => $Index_File_Name,
        -write_flag => 1);
    $inx->make_index(@ARGV);

    # script 2: retrieve some files
    use Bio::Index::Fasta;
    $Index_File_Name = shift;
    $inx = Bio::Index::Fasta->new($Index_File_Name);
    foreach $id (@ARGV) {
        $seq = $inx->fetch($id);    # Returns Bio::Seq object
        # do something with the sequence
    }
Hope this helps,

tandemrepeat

Re: Re: writing to arrays
by Anonymous Monk on Dec 26, 2002 at 17:20 UTC
    T.R., thanks for your comments. Indeed, the file that I am playing with is a FASTA file which will eventually be put through BLAST to generate some output. Thanks a lot for your help! No more answers for this question required, monks... thanks to everyone who replied!