Hi nemo2, I am a beginner too, but I'd like to suggest a few tips that will help you in this forum:

- First, read the posts about the posting rules, the formatting tips, etc. If you follow the rules, you will get more responses.

- Second, explain your problem as if you were talking to computer people, not biologists. I believe most people here don't know anything about the FASTA format, sequences, or gene names, and they don't need to know any biology to help you. A FASTA header is just a line that starts with ">", and a sequence is just a string, spread over one or more lines below that header line. Show the format of the text file in your post, so people know what you are talking about.

Now, what you have to do is not a hard job, but you need to know some Perl, such as:

- reading input files and writing to output files

- using regular expressions to capture the FASTA headers and the corresponding sequences

- using hashes to store each sequence name and its sequence as a key-value pair, which gives you a lookup table

- and probably some more...
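
To make the list above concrete, here is a minimal sketch of those pieces working together: it reads FASTA text, captures each header with a regular expression, and stores name and sequence in a hash. The sub name read_fasta and the sample records (seq1, seq2) are made up for illustration, and the sample is read from a string rather than from your real files, so treat it as a starting point and not as a solution to your exact comparison.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read FASTA-formatted text from a filehandle into a hash:
# key = header text after ">", value = the sequence joined into one string.
# (read_fasta is a hypothetical helper name, not a library function.)
sub read_fasta {
    my ($fh) = @_;
    my (%seq_for, $name);
    while (my $line = <$fh>) {
        $line =~ s/\r?\n\z//;          # strip the line ending (Unix or Windows)
        next if $line =~ /^\s*$/;      # skip blank lines
        if ($line =~ /^>(.*)/) {       # header line: capture the name
            $name = $1;
            $seq_for{$name} = '';
        }
        elsif (defined $name) {        # sequence line: append to current record
            $seq_for{$name} .= $line;
        }
    }
    return \%seq_for;
}

# Hypothetical sample data, held in a string so the sketch runs without files.
my $sample = ">seq1\nACGT\nTTGA\n>seq2\nGGCC\n";
open my $in, '<', \$sample or die "Cannot open sample data: $!";
my $seqs = read_fasta($in);
close $in;

# Print each name with its sequence; the same hash can be used to look up
# names found in a second file.
for my $name (sort keys %$seqs) {
    print "$name\t$seqs->{$name}\n";
}
```

To use it on a real file, open the filehandle on a path instead of the string (open my $in, '<', $path or die ...), and open a second handle with '>' if you want to write the results to an output file.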

A good introduction to Perl for biologists, showing how to use Perl in your bioinformatics tasks, is "Beginning Perl for Bioinformatics" by James Tisdall, published by O'Reilly. There you can find what you need to get started with Perl in bioinformatics, with examples taken from biology.

I hope this helps.


In reply to Re: comparing two fasta files by rogerd
in thread comparing two fasta files by nemo2
