in reply to Re: How to match the sequences with headers
in thread How to match the sequences with headers
The sample input is:
>101M:A:sequence MVLSEGEWQLVLHVWAKVEADVAGHGQDILIRLFKSHPETLEKFDRVKHLKTEAEMKASEDLKKHGVTVL +TALGA ILKKKGHHEAELKPLAQSHATKHKIPIKYLEFISEAIIHVLHSRHPGNFGADAQGAMNKALELFRKDIAA +KYKEL GYQG >101M:A:secstr HHHHHHHHHHHHHHGGGHHHHHHHHHHHHHHH GGGGGG TTTTT SHHHHHH HHHHHHHHHHH +HHHHH HHTTTT HHHHHHHHHHHHHTS HHHHHHHHHHHHHHHHHH GGG SHHHHHHHHHHHHHHHHHHHH +HHHHT T >102L:A:sequence MNIFEMLRIDEGLRLKIYKDTEGYYTIGIGHLLTKSPSLNAAAKSELDKAIGRNTNGVITKDEAEKLFNQ +DVDAA VRGILRNAKLKPVYDSLDAVRRAALINMVFQMGETGVAGFTNSLRMLQQKRWDEAAVNLAKSRWYNQTPN +RAKRV ITTFRTGTWDAYKNL >102L:A:secstr HHHHHHHHH EEEEEE TTS EEEETTEEEESSS TTTHHHHHHHHHHTS TTB HHHHHHHHHH +HHHHH HHHHHH TTHHHHHHHS HHHHHHHHHHHHHHHHHHHHT HHHHHHHHTT HHHHHHHHHSSHHHHHSHH +HHHHH HHHHHHSSSGGG
The first output should be all protein fasta sequences like:
>101M:A:sequence MVLSEGEWQLVLHVWAKVEADVAGHGQDILIRLFKSHPETLEKFDRVKHLKTEAEMKASEDLKKHGVTVL +TALGA ILKKKGHHEAELKPLAQSHATKHKIPIKYLEFISEAIIHVLHSRHPGNFGADAQGAMNKALELFRKDIAA +KYKEL GYQG >102L:A:sequence MNIFEMLRIDEGLRLKIYKDTEGYYTIGIGHLLTKSPSLNAAAKSELDKAIGRNTNGVITKDEAEKLFNQ +DVDAA VRGILRNAKLKPVYDSLDAVRRAALINMVFQMGETGVAGFTNSLRMLQQKRWDEAAVNLAKSRWYNQTPN +RAKRV ITTFRTGTWDAYKNL
The second Output file should hav secondary structures:
Thanks..>101M:A:secstr HHHHHHHHHHHHHHGGGHHHHHHHHHHHHHHH GGGGGG TTTTT SHHHHHH HHHHHHHHHHH +HHHHH HHTTTT HHHHHHHHHHHHHTS HHHHHHHHHHHHHHHHHH GGG SHHHHHHHHHHHHHHHHHHHH +HHHHT T >102L:A:secstr HHHHHHHHH EEEEEE TTS EEEETTEEEESSS TTTHHHHHHHHHHTS TTB HHHHHHHHHH +HHHHH HHHHHH TTHHHHHHHS HHHHHHHHHHHHHHHHHHHHT HHHHHHHHTT HHHHHHHHHSSHHHHHSHH +HHHHH HHHHHHSSSGGG
|
|---|
| Replies are listed 'Best First'. | |
|---|---|
|
Re^3: How to match the sequences with headers
by Lotus1 (Vicar) on Oct 23, 2011 at 16:07 UTC |