Hello Gideon, Bioinformatics questions get more responses when they are asked with less jargon. There are some great monks here who can answer almost any Perl question as long as it is stated clearly, but many of us know Perl and no biology or GIS, etc. They are happy to help, so you have to make the question clearer to get accurate responses.

Read Perl and Bioinformatics, which also serves as a welcoming note to you.

Now, a FastA file is just plain text: a header line starting with '>' followed by a string representing a nucleotide (DNA|RNA) sequence or a protein sequence. Perl has several strong string-manipulation approaches, so you can read or write such a file directly (check File Input and Output in the Tutorials), or you can use one of the BioPerl modules to get an object that wraps the file and that you can process further.
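For the direct route, here is a minimal sketch in core Perl only. It assumes the usual FastA layout described above; the sub name read_fasta is made up for illustration, and real files can have quirks this does not handle.

```perl
use strict;
use warnings;

# Hypothetical helper: read a FastA file into a list of [id, sequence] pairs.
# Assumes every record starts with a '>' header line.
sub read_fasta {
    my ($file) = @_;
    open my $fh, '<', $file or die "Cannot open $file: $!";

    my @records;
    while (my $line = <$fh>) {
        $line =~ s/\r?\n\z//;        # strip Unix or Windows line endings
        next if $line =~ /^\s*$/;    # skip blank lines
        if ($line =~ /^>(\S*)/) {    # header: the ID is the first word after '>'
            push @records, [$1, ''];
        }
        elsif (@records) {           # sequence line: append to the current record
            $line =~ s/\s+//g;
            $records[-1][1] .= $line;
        }
    }
    close $fh;
    return @records;
}

# Only runs when a file name is given on the command line.
if (@ARGV) {
    for my $rec (read_fasta($ARGV[0])) {
        printf "%s: %d residues\n", $rec->[0], length $rec->[1];
    }
}
```

This is fine for learning and for small files; for anything serious the BioPerl route saves you from maintaining your own parser.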

The best thing about BioPerl is that you don't have to write your own parser when the module you use recognizes the file format, and many formats are supported (e.g. GenBank, FastA, etc.). You can even convert among these formats or generate a sequence from scratch in a particular format, all the way through to very advanced bioinformatics tasks that involve sequence analysis. The possibilities abound.

Here is a quick example that creates an input sequence object and does a basic analysis:

    use strict;
    use warnings;

    # counting motifs
    use Bio::SeqIO;

    my $file = "SeqHisham.txt";
    my $in   = Bio::SeqIO->new(
        -format => 'fasta',
        -file   => $file,
    );

    my $motif_Count = 0;
    my $motif       = 'ga';

    while (my $seq = $in->next_seq) {
        # since the sequence would be processed in a certain way, convert to string
        my $string = $seq->seq;
        if ($string =~ /$motif/i) {    # make the regex appropriate...
            $motif_Count++;
        }
    }
    print "Found $motif_Count hits\n";
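Note that the loop above adds one hit per sequence that contains the motif, however many times the motif occurs in it. If you want every occurrence instead, something along these lines could replace the if-block, working on the string you get from $seq->seq. The sub names are made up for this sketch, and the motif is treated as literal text, not as a pattern.

```perl
use strict;
use warnings;

# Hypothetical helper: count every non-overlapping, case-insensitive
# occurrence of $motif in $string.
sub count_motif {
    my ($string, $motif) = @_;
    # \Q...\E quotes the motif as literal text; assigning the /g matches
    # to an empty list and then to a scalar yields the number of matches.
    my $count = () = $string =~ /\Q$motif\E/gi;
    return $count;
}

# Same idea, but a lookahead lets matches overlap (e.g. 'aa' in 'aaaa').
sub count_motif_overlapping {
    my ($string, $motif) = @_;
    my $count = () = $string =~ /(?=\Q$motif\E)/gi;
    return $count;
}

print count_motif('GAttgaGA', 'ga'), "\n";              # prints 3
print count_motif_overlapping('aaaa', 'aa'), "\n";      # prints 3
```

Whether overlapping matches should count depends on the biology of your question, so pick the variant deliberately.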

Running it from the 'command prompt' is another issue. You can do that using @ARGV (check perlvar, the Perl predefined variables documentation, for 'ARGV') or one of the Getopt modules, so start from the basics and build up. Check www.bioPerl.org and their HowTos, but before that invest time in learning enough Perl to get you started; check the Reviews section for book reviews and also work on identifying a general learning path.
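As a rough sketch of the command-line part, using the core Getopt::Long module: the option names --file and --motif, the default motif and the sub name parse_args are all made up for this example, so adapt them to your script.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: parse "--file NAME --motif TEXT" style arguments.
# A bare first argument is accepted as the file name too.
sub parse_args {
    my @args  = @_;
    my $usage = "Usage: $0 --file FASTA [--motif TEXT]\n";
    my %opt   = (motif => 'ga');    # default motif, only for illustration

    GetOptionsFromArray(\@args,
        'file=s'  => \$opt{file},
        'motif=s' => \$opt{motif},
    ) or die $usage;

    # Fall back to the first leftover argument as the file name.
    $opt{file} = shift @args if !defined $opt{file} && @args;
    die $usage unless defined $opt{file};
    return \%opt;
}

# Only runs when arguments were given on the command line.
if (@ARGV) {
    my $opt = parse_args(@ARGV);
    print "file=$opt->{file} motif=$opt->{motif}\n";
}
```

Saved under a name of your choosing, say count_motif.pl, it could be called as perl count_motif.pl --file SeqHisham.txt --motif ga, and the resulting values passed on to the Bio::SeqIO code.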

Best of luck and have a nice Perl journey


Excellence is an Endeavor of Persistence. Chance Favors a Prepared Mind.

In reply to Re: Read in fasta format by biohisham
in thread Read in fasta format by Gideon
