in reply to How to match the sequences with headers

Without sample input, expected output, and a description of how your actual output differs, it's hard to even attempt to help :)

See Short, Self Contained, Correct Example and How do I post a question effectively?

For generic BioPerl advice, see Bio::SeqIO (http://doc.bioperl.org/releases/bioperl-1.0.1/Bio/SeqIO.html) and http://www.bioperl.org/wiki/Main_Page


Replies are listed 'Best First'.
Re^2: How to match the sequences with headers
by anonym (Acolyte) on Oct 23, 2011 at 14:43 UTC

    The sample input is:

    >101M:A:sequence
    MVLSEGEWQLVLHVWAKVEADVAGHGQDILIRLFKSHPETLEKFDRVKHLKTEAEMKASEDLKKHGVTVL +TALGA ILKKKGHHEAELKPLAQSHATKHKIPIKYLEFISEAIIHVLHSRHPGNFGADAQGAMNKALELFRKDIAA +KYKEL GYQG
    >101M:A:secstr
    HHHHHHHHHHHHHHGGGHHHHHHHHHHHHHHH GGGGGG TTTTT SHHHHHH HHHHHHHHHHH +HHHHH HHTTTT HHHHHHHHHHHHHTS HHHHHHHHHHHHHHHHHH GGG SHHHHHHHHHHHHHHHHHHHH +HHHHT T
    >102L:A:sequence
    MNIFEMLRIDEGLRLKIYKDTEGYYTIGIGHLLTKSPSLNAAAKSELDKAIGRNTNGVITKDEAEKLFNQ +DVDAA VRGILRNAKLKPVYDSLDAVRRAALINMVFQMGETGVAGFTNSLRMLQQKRWDEAAVNLAKSRWYNQTPN +RAKRV ITTFRTGTWDAYKNL
    >102L:A:secstr
    HHHHHHHHH EEEEEE TTS EEEETTEEEESSS TTTHHHHHHHHHHTS TTB HHHHHHHHHH +HHHHH HHHHHH TTHHHHHHHS HHHHHHHHHHHHHHHHHHHHT HHHHHHHHTT HHHHHHHHHSSHHHHHSHH +HHHHH HHHHHHSSSGGG

    The first output file should contain all the protein FASTA sequences, like:

    >101M:A:sequence
    MVLSEGEWQLVLHVWAKVEADVAGHGQDILIRLFKSHPETLEKFDRVKHLKTEAEMKASEDLKKHGVTVL +TALGA ILKKKGHHEAELKPLAQSHATKHKIPIKYLEFISEAIIHVLHSRHPGNFGADAQGAMNKALELFRKDIAA +KYKEL GYQG
    >102L:A:sequence
    MNIFEMLRIDEGLRLKIYKDTEGYYTIGIGHLLTKSPSLNAAAKSELDKAIGRNTNGVITKDEAEKLFNQ +DVDAA VRGILRNAKLKPVYDSLDAVRRAALINMVFQMGETGVAGFTNSLRMLQQKRWDEAAVNLAKSRWYNQTPN +RAKRV ITTFRTGTWDAYKNL

    The second output file should have the secondary structures:

    >101M:A:secstr
    HHHHHHHHHHHHHHGGGHHHHHHHHHHHHHHH GGGGGG TTTTT SHHHHHH HHHHHHHHHHH +HHHHH HHTTTT HHHHHHHHHHHHHTS HHHHHHHHHHHHHHHHHH GGG SHHHHHHHHHHHHHHHHHHHH +HHHHT T
    >102L:A:secstr
    HHHHHHHHH EEEEEE TTS EEEETTEEEESSS TTTHHHHHHHHHHTS TTB HHHHHHHHHH +HHHHH HHHHHH TTHHHHHHHS HHHHHHHHHHHHHHHHHHHHT HHHHHHHHTT HHHHHHHHHSSHHHHHSHH +HHHHH HHHHHHSSSGGG
    Thanks..

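      At the beginning of your program put in $/="\n\n"; so that each read returns one blank-line-separated record (header plus its data lines) instead of a single line. This only works if the records in your file are actually separated by blank lines.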

      Then inside the while loop if $_ matches 'sequence' print $_ to out1; if it matches 'secstr' print $_ to out2.
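
The advice above could be sketched roughly as follows. This is only a minimal illustration, assuming every record in the input is followed by a blank line (the sample as posted does not confirm that). The sub name split_records and the toy record 1ABC are made up for the example, and the match is applied to the header line only rather than to all of $_, which is a small tightening of the suggestion and not part of it. In-memory filehandles stand in for the real input and output files.

```perl
use strict;
use warnings;

# Copy blank-line-separated records to one of two handles, based on
# whether the header ends in ":sequence" or ":secstr".
# ASSUMPTION: each record is followed by an empty line; without blank
# lines the whole input would be read as a single record.
sub split_records {
    my ($in, $out_seq, $out_ss) = @_;
    local $/ = "\n\n";                  # one record per read, not one line
    while (my $record = <$in>) {
        $record =~ s/\A\s+//;           # drop stray blank lines before the header
        $record =~ s/\n+\z/\n/;         # reduce the trailing separator to one newline
        next unless $record =~ /\A>(\S+)/;
        my $id = $1;
        if    ($id =~ /:sequence\z/) { print {$out_seq} $record }
        elsif ($id =~ /:secstr\z/)   { print {$out_ss}  $record }
    }
    return;
}

# Toy input (hypothetical record, not taken from the real data).
my $sample = ">1ABC:A:sequence\nMVLSEGEWQ\n\n>1ABC:A:secstr\nHHHH  GGG\n\n";

# In-memory handles replace the real files for this demonstration.
my ($seq_text, $ss_text) = ('', '');
open my $in,   '<', \$sample   or die "cannot read sample: $!";
open my $out1, '>', \$seq_text or die "cannot open out1: $!";
open my $out2, '>', \$ss_text  or die "cannot open out2: $!";

split_records($in, $out1, $out2);
close $_ for $in, $out1, $out2;

print "sequences:\n$seq_text";
print "structures:\n$ss_text";
```

With real files you would open the input and the two output files by name instead of using scalar references. If the file turns out to have no blank lines between records, this approach will not split anything, and reading line by line while remembering the most recent header would be the more robust choice.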