>nameofsequence\n
ATCGTACGTTGCTE\n
>anothername\n
GTCTGT\n
so that a line starting with > and containing a sequence name is followed by a line containing that sequence's nucleotide information.
I am thinking of reading them in four lines at a time, because I have reason to suspect that, due to some previous operations, there might be sequences directly following each other with different names (on the >sequencename\n line) but exactly the same sequence information (on the following ATGCTGT\n line). Right now I am looking to identify and remove such duplicates, but I might later use scripts that do comparison, extraction, etc. on neighbouring sequences in my files. (Two neighbours means four lines.)

In reply to Re^4: Reading files n lines a time by naturalsciences, in thread Reading files n lines a time by naturalsciences
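
The duplicate removal described above could be sketched roughly as follows. This is only an illustration in Python, not the approach settled on in the thread: the function names `read_records` and `drop_adjacent_duplicates` are made up for this example, and it assumes every record is exactly one `>` name line followed by one sequence line, as in the sample.

```python
from itertools import zip_longest


def read_records(lines):
    """Yield (name, sequence) pairs from alternating name/sequence lines.

    Hypothetical helper: assumes each record is exactly two lines.
    """
    # Strip newlines and ignore blank lines (e.g. a trailing empty line).
    stripped = (line.rstrip("\n") for line in lines if line.strip())
    # Passing the same generator twice pulls the lines off in pairs.
    for header, seq in zip_longest(stripped, stripped, fillvalue=""):
        if not header.startswith(">"):
            raise ValueError(f"expected a '>' name line, got: {header!r}")
        yield header[1:], seq


def drop_adjacent_duplicates(records):
    """Skip any record whose sequence equals the one directly before it."""
    previous_seq = None
    for name, seq in records:
        if seq != previous_seq:
            yield name, seq
        previous_seq = seq


sample = [
    ">nameofsequence\n", "ATCGTACGTTGCT\n",
    ">anothername\n", "ATCGTACGTTGCT\n",
    ">thirdname\n", "GTCTGT\n",
]
kept = list(drop_adjacent_duplicates(read_records(sample)))
for name, seq in kept:
    print(f">{name}\n{seq}")
```

With the made-up sample above, `anothername` is dropped because its sequence matches the record just before it, and the first name of each run is the one kept. Remembering only the previous sequence, rather than reading fixed blocks of four lines, is a deliberate choice in this sketch: fixed blocks would miss a duplicate pair that straddles two blocks (records 2 and 3, say), and would not handle runs of three or more identical neighbours.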