So my general problem is one in bioinformatics: given the entire proteome of a species, where each gene can be spliced into multiple protein isoforms of different lengths, pick only the longest protein isoform for each gene.
In other words, if there are genes A, B, C, etc.,
with 3, 1 and 3 protein isoforms respectively
(a.1, a.2, a.3, b.1, c.2, c.5, c.7, and so on),
of respective lengths 12, 11, 12, 15, 34, 12, 45, and so on,
each with its corresponding peptide sequence,
then I want the Perl script to return ONLY a.1 or a.3 (they tie at length 12) and its sequence for gene A, b.1 and its sequence for gene B, c.7 and its sequence for gene C... you get the idea?
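In case it helps anyone searching later, here is a minimal sketch of the selection step, not my actual script. It assumes the isoforms are already loaded into a hash of ID => sequence, and that the gene name is whatever precedes the first dot in the ID. The hash name %seq_for, the sub name longest_isoforms and the placeholder sequences are all made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical input: isoform ID => peptide sequence.
# The sequences are placeholders built only to have the lengths from the example.
my %seq_for = (
    'a.1' => 'M' x 12,
    'a.2' => 'M' x 11,
    'a.3' => 'M' x 12,
    'b.1' => 'M' x 15,
    'c.2' => 'M' x 34,
    'c.5' => 'M' x 12,
    'c.7' => 'M' x 45,
);

# Return a hash of gene => ID of its longest isoform.
# Assumption: the gene name is the part of the ID before the first dot.
sub longest_isoforms {
    my ($seqs) = @_;
    my %best;
    # Sorting the IDs makes ties deterministic: the first ID in sort order wins.
    for my $id (sort keys %$seqs) {
        my ($gene) = $id =~ /^([^.]+)\./;
        next unless defined $gene;    # skip IDs that do not look like gene.N
        if (   !exists $best{$gene}
            || length($seqs->{$id}) > length($seqs->{ $best{$gene} }))
        {
            $best{$gene} = $id;
        }
    }
    return %best;
}

my %best = longest_isoforms(\%seq_for);
for my $gene (sort keys %best) {
    my $id = $best{$gene};
    print "$gene\t$id\t", length($seq_for{$id}), "\n";
}
```

With the strict greater-than comparison, a.1 is kept over a.3 because it comes first in sort order. Either one would do for my purposes.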
Your suggested syntax using grep worked, except that it returned all the matches as a list, so I used shift to take the first match (which for my purposes is as good as any other match, when multiple keys match one of the values in my array).
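For anyone reading this without the rest of the thread, the grep-then-shift idea looks roughly like the sketch below. This is not the exact line from the reply above. The names %length_of and @wanted_lengths and the data are hypothetical, chosen to match the isoform example.

```perl
use strict;
use warnings;

# Hypothetical data: isoform ID => peptide length.
my %length_of = (
    'a.1' => 12,
    'a.3' => 12,
    'c.2' => 34,
    'c.5' => 12,
    'c.7' => 45,
);

# Values whose keys we want to recover.
my @wanted_lengths = (45, 12);

my @ids;
for my $len (@wanted_lengths) {
    # grep returns every key whose value matches, so there may be several.
    my @matches = grep { $length_of{$_} == $len } sort keys %length_of;
    # shift takes the first match; it is undef if nothing matched.
    my $first = shift @matches;
    push @ids, $first if defined $first;
}

print "@ids\n";
```

Sorting the keys first is my own addition here, so that the "first" match is the same on every run, since Perl does not guarantee hash key order.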
Thanks for the useful syntax, I had not come across it yet in the Beginning Perl 3rd edition book, and it's only been 10 days since I started teaching myself Perl! So your help is much appreciated...
I call this my "non-redundification" Perl script: it removes protein isoform redundancy by selecting ONLY the longest isoform for each gene in a proteome. My script is ~100 lines long, which I think would be a joke for you Monks! But hey, I am just a Padawan learner as of now! :)
Thanks again to both of you!
In reply to Re^2: Extract hash keys for values stoted in array
by onlyIDleft
in thread Extract hash keys for values stoted in array
by onlyIDleft