in reply to How to format fasta file

Hi andyBio! I'm a PerlMonk, not a BioMonk: you will have to lead me by the hand.

... I want to: ... delete reads ... [c]ount ... read[s] ... [d]elete non-unique reads.

What's a "read"? Also, the sequences in both records of your input example are 'ATGGCTATCGATT', but while the first example output record is 'ATGGCTATCGATT' (with four reads in it), the second output record (from the same input data) is 'TGCATGCGCTACG', with seven (!) reads. That adds to the confusion, at least for me. Please see "I know what I mean. Why don't you?"

Update: A Wikipedia link ([wp://...]) or suchlike to a discussion of what a "read" is might be helpful. Please see "What shortcuts can I use for linking to other information?"


Give a man a fish:  <%-{-{-{-<

Re^2: How to format fasta file
by FreeBeerReekingMonk (Deacon) on Apr 10, 2016 at 23:18 UTC
    Indeed.

    How should we know how many characters one BP (base pair) is? In this case it is 1 byte: a single letter (A, T, G, or C) represents one base pair.
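
    To make that concrete, here is a minimal sketch, assuming the sequence is plain ASCII text as in the example quoted above. Under that assumption the character count is also the base count. The variable names ($seq, $bases, %count) are mine, not the OP's.

```perl
use strict;
use warnings;

# Sequence copied from the input example discussed in this thread.
my $seq = 'ATGGCTATCGATT';

# Assumption: one ASCII letter per base, so length() gives the base count.
my $bases = length $seq;

# Tally how often each letter occurs (%count is a hypothetical helper).
my %count;
$count{$_}++ for split //, $seq;

print "$bases bases\n";
print "$_: $count{$_}\n" for sort keys %count;
```

    This only covers counting letters within one sequence. It says nothing about what a "read" is, which still needs an answer from the OP.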