Re^2: Comparing Files

Sadly it is more fun to use the same word for multiple things in biology...

FASTA can be the sequence format (also called Pearson format). FASTA is also a program for searching sequences by aligning them.

As to what the poster is really after, I think it is the output of the latter, used to generate a list of sequences which are significantly similar to the input query set. This is something you might do if you are trying to build a gene family made up of similar sequences.

But at this point we are just playing guessing games, so the poster will have to clarify what they want. Honestly, this is not the best forum for these questions. Consider posting to the Bioperl list if you have bioinformatics+perl questions, or else spend a little more time explaining the algorithm you are trying to write.

As I have already posted in Re: Fasta Using Perl, it is possible to parse the output from FASTA with Bio::SearchIO, to parse the sequence files with Bio::SeqIO, and to work with databases of sequences through local indexes using Bio::DB::Fasta, Bio::Index::Fasta and friends.
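If my guess about the goal is right, a minimal sketch of the Bio::SearchIO side might look like the following. It assumes BioPerl is installed and that you already have a FASTA search report on disk. The helper name significant_hits, the report file name, and the 1e-5 cutoff are placeholders I made up for illustration, not anything from the original question.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::SearchIO;

# Hypothetical helper: return [query, hit, evalue] rows for every hit
# in a FASTA search report whose E-value is at or below the cutoff.
sub significant_hits {
    my ($report_file, $max_evalue) = @_;

    # 'fasta' here selects the parser for FASTA program output,
    # not the FASTA sequence format.
    my $in = Bio::SearchIO->new(-format => 'fasta',
                                -file   => $report_file);
    my @rows;
    while (my $result = $in->next_result) {      # one result per query
        while (my $hit = $result->next_hit) {    # one hit per database sequence
            my $evalue = $hit->significance;
            next unless defined $evalue && $evalue <= $max_evalue;
            push @rows, [ $result->query_name, $hit->name, $evalue ];
        }
    }
    return @rows;
}

# Only run when a report file is given on the command line.
if (@ARGV) {
    my ($report_file, $max_evalue) = @ARGV;
    $max_evalue = 1e-5 unless defined $max_evalue;   # assumed default cutoff
    print join("\t", @$_), "\n"
        for significant_hits($report_file, $max_evalue);
}
```

Run it as something like `perl hits.pl myreport.fasta 1e-10` (both names are made up). The hit names it prints could then be looked up in a local sequence database with Bio::DB::Fasta or Bio::Index::Fasta if you need the sequences themselves.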
