Gideon has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks, I am a biologist trying to learn Perl programming at the Windows command prompt. How do I get to read into the fasta format? Thanks.

Re: Read in fasta format
by ww (Archbishop) on May 11, 2010 at 22:37 UTC

    Perhaps you can clarify your question a bit. As I read it, there are at least two possible meanings. Are you seeking:

    1. help on how to read a file in fasta format into your program?
      or
    2. suggestions about where to "read into" (in the British sense of the phrase) the fasta format?
      (I think the generic English for the latter would be "read about the fasta format", "study the fasta format", or similar.)

    http://bioinformaticsweb.net/tutorial.html may lead you to answers on either. Or, perhaps, one of the bio-monks will happen by and have a better idea, but in the meantime, you may wish to use Super Search with fasta, bioinformatics or bioperl as a search term. Here's a small sample of what's available here.

Re: Read in fasta format
by biohisham (Priest) on May 12, 2010 at 06:12 UTC
    Hello Gideon. Bioinformatics-related questions will garner more responses if they are asked clearly. There are some great monks here who can answer almost any Perl question, provided it is stated as clearly as possible. Some of us know Perl but no biology, GIS, and so on, though we are very willing to help, so you have to make your question clearer to get accurate responses.

    As a welcoming note, read Perl and Bioinformatics.

    Now, a FastA file is just text: a header line starting with '>' followed by a string representing a nucleotide (DNA or RNA) sequence or a protein sequence. Perl has several strong string-manipulation approaches. You can access such a file directly for input or output (check File Input and Output in the Tutorials), or you can use one of the BioPerl modules to get an object that wraps the file and that you can process further.

    The best thing about BioPerl is that you don't have to implement the parsing yourself if the module you use recognizes the file format, and many formats are supported (e.g. GenBank, FastA). You can even convert among these formats or generate a sequence from scratch in a particular format, all the way through to very advanced bioinformatics tasks that involve sequence analysis. The possibilities abound.

    Here is a quick example that creates an input sequence object and does a basic analysis:

        use strict;
        use warnings;

        # counting motifs
        use Bio::SeqIO;

        my $file = "SeqHisham.txt";
        my $in   = Bio::SeqIO->new(
            -format => 'fasta',
            -file   => $file,
        );
        my $motif_Count = 0;
        my $motif       = 'ga';

        while (my $seq = $in->next_seq) {
            # since the sequence would be processed in a certain way, convert to string
            my $string = $seq->seq;
            if ($string =~ /$motif/i) {    # make the regex appropriate...
                $motif_Count++;
            }
        }
        print "Found $motif_Count hits\n";
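
    Note that the loop above counts each sequence at most once, however many times the motif occurs in it. If you want the total number of occurrences instead, one possible sketch in plain Perl follows. The helper name count_motif is made up for illustration, and it assumes you want non-overlapping, case-insensitive matches of a literal motif.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper (not part of BioPerl): count non-overlapping,
    # case-insensitive occurrences of a literal motif in a sequence string.
    sub count_motif {
        my ($string, $motif) = @_;
        # \Q...\E makes the motif match as literal text, not as a regex.
        # Assigning the match list to () and then to a scalar yields its size.
        my $count = () = $string =~ /\Q$motif\E/gi;
        return $count;
    }

    # Made-up sample sequence, only for demonstration
    print count_motif('GATTACAGAGA', 'ga'), "\n";
    ```

    Inside the while loop you could then add count_motif($seq->seq, $motif) to a running total rather than incrementing by one.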

    As for the 'command prompt', that is another issue. You can pass arguments to your script using @ARGV (check perlvar, the Perl predefined variables documentation, for 'ARGV') or one of the Getopt modules. So start from the basics and build up: check www.bioPerl.org and their HowTos, but before that invest time in learning enough Perl to get you started. Check the Reviews section for book reviews, and also work on identifying a general learning path.
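
    To make the @ARGV and Getopt suggestion concrete, here is a minimal sketch using the core Getopt::Long module. The --motif option, the parse_args helper, and the usage text are assumptions chosen for illustration, and the default motif 'ga' simply mirrors the example above.

    ```perl
    use strict;
    use warnings;
    use Getopt::Long qw(GetOptionsFromArray);

    # Hypothetical helper: accept an optional --motif STRING followed by
    # one file name. Returns (file, motif), or an empty list on bad input.
    sub parse_args {
        my @args  = @_;
        my $motif = 'ga';                      # default when --motif is absent
        GetOptionsFromArray(\@args, 'motif=s' => \$motif)
            or return;                         # unknown option or missing value
        my $file = shift @args;                # first leftover argument
        return unless defined $file;
        return ($file, $motif);
    }

    # @ARGV holds whatever was typed after the script name
    my ($file, $motif) = parse_args(@ARGV);
    if (defined $file) {
        print "Would read '$file' and search for '$motif'\n";
    }
    else {
        print "Usage: perl $0 [--motif STRING] FILE\n";
    }
    ```

    Saved under a name of your choice, say count_motif.pl, it could be run from the Windows command prompt as: perl count_motif.pl --motif tata SeqHisham.txt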

    Best of luck and have a nice Perl journey


    Excellence is an Endeavor of Persistence. Chance Favors a Prepared Mind.
Re: Read in fasta format
by Khen1950fx (Canon) on May 12, 2010 at 00:03 UTC
    Here's a tutorial from the bioperl wiki that will get you up and running.
Re: Read in fasta format
by llancet (Friar) on May 12, 2010 at 03:07 UTC
    See perlboot, then look at Bio::SeqIO. You could also write a parser yourself, as the FASTA format is simple.
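
    As a rough illustration of how simple such a hand-written parser can be, here is a sketch in plain Perl with no BioPerl dependency. The read_fasta helper and the sample sequences are made up for this example. It assumes each record is a '>' header line followed by one or more sequence lines, and it does not handle comments or other variants of the format.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: read FASTA records from an open filehandle.
    # Returns a list of [header, sequence] pairs in file order.
    sub read_fasta {
        my ($fh) = @_;
        my @records;
        while (my $line = <$fh>) {
            $line =~ s/\r?\n\z//;              # strip Unix or Windows line endings
            next if $line =~ /^\s*$/;          # skip blank lines
            if ($line =~ /^>(.*)/) {
                push @records, [$1, ''];       # a header line starts a new record
            }
            elsif (@records) {
                $line =~ s/\s+//g;             # drop stray whitespace
                $records[-1][1] .= $line;      # a sequence may span several lines
            }
        }
        return @records;
    }

    # Demo on an in-memory filehandle; the data below is invented
    my $sample = ">seq1 first example\nGATTACA\nGATC\n>seq2\nACGT\n";
    open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
    for my $rec (read_fasta($fh)) {
        my ($header, $seq) = @$rec;
        printf "%s: %d residues\n", $header, length $seq;
    }
    close $fh;
    ```

    To read a real file, open it by name instead of passing the in-memory string. For anything beyond simple cases, Bio::SeqIO is likely the safer choice.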