I have an input file containing MANY records in this format:
    >Record 1
    AGTCTAGTCAT
    CATCATAAGAT
    CATCAATCACA
    >Other Record
    ATGAACAGCAG
    ATGAAGAATGG
    ATAG
Ah, yes. The ol' FASTA. If you know that you have no repeats in the id/description info after the '>', the following "one-liner" gets you a long way with a standard FASTA file:
    % perl -lne 's/>//?$s=$_:$s{$s}.=$_;\
    > END{ <do smthng w/ %s> }' \
    > huge.fasta massive.fasta humongo.fasta

The uniqueness condition mentioned above is crucial, otherwise this scriptlet will mess with your mind.
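For anyone who finds the one-liner too terse, here is roughly what it does, spelled out as a script. Treat it as a sketch rather than a drop-in replacement: `read_fasta` is a name made up for illustration, the header match is anchored at the start of the line (the one-liner strips the first '>' wherever it appears), and a warning for repeated headers is added because of the uniqueness caveat.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# read_fasta is a hypothetical helper name, not part of any module.
# It returns a hash reference mapping each header line (minus the
# leading '>') to the concatenated sequence lines that follow it.
sub read_fasta {
    my @files = @_;
    my %seq;
    for my $file (@files) {
        open my $fh, '<', $file or die "Cannot open $file: $!";
        my $id;                              # current record id within this file
        while (my $line = <$fh>) {
            chomp $line;                     # the job -l does in the one-liner
            if ($line =~ s/^>//) {           # header line: start a new record
                $id = $line;
                warn "duplicate header '$id' in $file\n" if exists $seq{$id};
            }
            elsif (defined $id) {            # sequence line: append to the record
                $seq{$id} .= $line;
            }
        }
        close $fh;
    }
    return \%seq;
}

# When run as a script, print each id and the length of its sequence.
unless (caller) {
    my $seq = read_fasta(@ARGV);
    printf "%s\t%d\n", $_, length $seq->{$_} for sort keys %$seq;
}
```

Saved under a name of your choosing, it takes the same file arguments as the one-liner. With repeated headers the sequences still get glued together under one key, but at least you are told about it.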
A generically useful specialization of the above is

    % perl -MStorable -lne 's/>//?$s=$_:$s{$s}.=$_;\
    > END{ store \%s, "for_later" }' \
    > huge.fasta massive.fasta humongo.fasta

Then you can read the-hash-formerly-known-as-%s from any script whenever you please. See Storable. Keep in mind, however, that, if left unattended, Storable::store clobbers without remorse.
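Reading the hash back later might look something like the sketch below. `retrieve` and `store` are Storable's own functions; `save_once` is a made-up helper, shown only as one way to keep `store` from overwriting an existing file, and the path is taken from the command line rather than hard-coded.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Storable qw(store retrieve);

# save_once is a hypothetical helper, not part of Storable. It refuses
# to write if the target already exists, since store() itself would
# silently replace the file.
sub save_once {
    my ($data, $path) = @_;
    die "$path already exists; not overwriting\n" if -e $path;
    store($data, $path);
    return $path;
}

# When run as a script with a path argument (e.g. "for_later"),
# load the stored hash and list each id with its sequence length.
if (!caller() && @ARGV) {
    my $seq = retrieve($ARGV[0]);
    printf "%s\t%d\n", $_, length $seq->{$_} for sort keys %$seq;
}
```

Run it with the stored file as its argument, i.e. `for_later` for the one-liner above. The existence check is not atomic, so it guards against absent-mindedness rather than against two processes racing for the same file.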
the lowliest monk
In reply to Re: Input record separator by tlm, in thread Input record separator by travisbickle34