BioUs2003 has asked for the wisdom of the Perl Monks concerning the following question:
Hey monks
I have a biology-programming question:
I have to write a program that can take one or more protein sequences, digest them into smaller peptide sequences, and report the peptides one per line. The only trick is that when trypsin sees a K or an R it should split the sequence after that letter, unless the next letter (amino acid) is proline (P). The program should read protein sequences in FASTA format and return the individual peptides after digestion. It should also consider missed cleavages.
OK, so I think that the expression for splitting and cutting with trypsin is:
split /(?<=[KR])(?=[^P])/
but I am lost about the other stuff!
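For reference, here is a minimal sketch of how the pieces in the question could fit together around that split expression. It is only one possible reading of the requirements, not a definitive implementation. The sub names (read_fasta, digest, with_missed_cleavages), the demo sequence, and the choice to treat "missed cleavages" as joining up to N+1 consecutive fully cleaved fragments are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Parse FASTA text into a list of [header, sequence] pairs.
# Hypothetical helper: sequence lines are concatenated and upper-cased.
sub read_fasta {
    my ($text) = @_;
    my @records;
    for my $line (split /\r?\n/, $text) {
        next unless length $line;
        if ($line =~ /^>(.*)/) {
            push @records, [$1, ''];          # new record starts at '>'
        }
        elsif (@records) {
            $line =~ s/\s+//g;                # drop stray whitespace
            $records[-1][1] .= uc $line;
        }
    }
    return @records;
}

# Full digestion: cut after K or R unless the next residue is P.
# Uses the zero-width split expression from the question.
sub digest {
    my ($seq) = @_;
    my @fragments = split /(?<=[KR])(?=[^P])/, $seq;
    return @fragments;
}

# Assumed model of missed cleavages: every run of up to
# $max_missed + 1 consecutive fully cleaved fragments is a peptide.
sub with_missed_cleavages {
    my ($max_missed, @fragments) = @_;
    my @peptides;
    for my $i (0 .. $#fragments) {
        my $last = $i + $max_missed;
        $last = $#fragments if $last > $#fragments;
        for my $j ($i .. $last) {
            push @peptides, join '', @fragments[$i .. $j];
        }
    }
    return @peptides;
}

# Demo on a made-up sequence, allowing one missed cleavage.
my $fasta = ">demo_protein (hypothetical)\nAKRPGK\nMR\n";
for my $record (read_fasta($fasta)) {
    my ($header, $seq) = @$record;
    print "$_\n" for with_missed_cleavages(1, digest($seq));
}
```

To run this on a real file, the text could be slurped first (for example with local $/ and a filehandle) and passed to read_fasta. Keeping digestion and missed-cleavage handling in separate subs means the cleavage rule can change without touching the joining logic.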
Re: Trypsin digestion
by kennethk (Abbot) on Oct 22, 2015 at 16:05 UTC | |
by Anonymous Monk on Oct 23, 2015 at 05:33 UTC | |
by AnomalousMonk (Archbishop) on Oct 23, 2015 at 06:27 UTC | |
|
Re: Trypsin digestion
by talexb (Chancellor) on Oct 22, 2015 at 15:58 UTC | |
by BioUs2003 (Initiate) on Oct 22, 2015 at 17:01 UTC | |
by AnomalousMonk (Archbishop) on Oct 22, 2015 at 21:11 UTC |