in reply to Re: Bioinformatics: Regex loop, no output
in thread Bioinformatics: Regex loop, no output
The output from your code shows some problems:
The K or R residue that terminates each peptide is being incorrectly removed from the output peptides. (At least, I think this is incorrect. TamaDP doesn't show desired output, but seems satisfied with output examples given in various replies in this thread that include these residues.) So I assume GSDVN should really be GSDVNR and the "null" sequence following it should really be the single-residue sequence R. This is all down to the incorrect definition of the s/// match pattern; take a look at some other replies in this thread for what I feel are more correct s/// patterns.
The peptide is DAAAAATTLTTTAMTTTTTTC
The peptide is MMFRPPPPPGGGGGGGGGGGG
The peptide is ALTAMCMNVWEITYH
The peptide is GSDVN
The peptide is
The peptide is ASFAQPPPQPPPPLLAIKPASDASD
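For illustration only, here is a minimal sketch of a cleavage that keeps the terminating K or R. It assumes the intended rule is "cut after K or R unless the next residue is P", and it uses split rather than the s/// approach discussed in this thread. The sub name cleave and the test string are made up for this example; they are not taken from the OP's code or data.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the OP's code): cut a protein string
# after every K or R that is not immediately followed by P.
# The lookbehind/lookahead pair matches a zero-width position, so no
# residue is consumed and the K or R stays on the end of its peptide.
sub cleave {
    my ($protein) = @_;
    return split /(?<=[KR])(?!P)/, $protein;
}

# Made-up test sequence, chosen to contain KP (no cut) and RR (two cuts).
my @peptides = cleave('AAKPGSDVNRRASFK');
print "The peptide is $_\n" for @peptides;
```

With this pattern the two adjacent R residues yield a peptide ending in R followed by a single-residue peptide R, rather than a truncated peptide followed by an empty one.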
On an unrelated note, the regex in the condition expression of the
if ($protein =~ m/[K(?!P)|R(?!P)]/g) { ... }
block isn't doing what I think you think it's doing. The [K(?!P)|R(?!P)] character class is exactly equivalent to the [KPR()?!|] class; metacharacters (alternations, groupings, etc.) have no meaning in a character class, so ()?!| are just literal characters (and repeated characters have no effect whatsoever). Also, the /g modifier in the m//g match is useless in the boolean context of a conditional, although it does no harm (except to burn a few more innocent computrons). Again, all this doesn't affect the basic problem with the code, which stems from the incorrect s/// match.
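A quick way to convince yourself of the character-class point is to run both classes, plus what I am guessing was the intended pattern, against a few strings. This is only a sketch: the sub name classify and the test strings are invented for the example, and [KR](?!P) is my assumption about what was meant ("K or R not followed by P").

```perl
use strict;
use warnings;

my $class_as_written = qr/[K(?!P)|R(?!P)]/;  # the class from the OP's condition
my $class_equivalent = qr/[KPR()?!|]/;       # same set of literal characters
my $with_lookahead   = qr/[KR](?!P)/;        # assumed intent: K or R, not before P

# Hypothetical helper: return 1/0 for each of the three patterns.
sub classify {
    my ($s) = @_;
    return map { $s =~ $_ ? 1 : 0 }
        $class_as_written, $class_equivalent, $with_lookahead;
}

for my $s ('P', '(', 'KP', 'AAA', 'AKA') {
    printf "%-4s as_written=%d equivalent=%d lookahead=%d\n", $s, classify($s);
}
```

The first two columns always agree, and strings such as 'P' or '(' match the class even though they contain no K or R at all.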
I use Data::Dumper all the time because I've been fooled by my data too many times.
Yea and amen brother, yea and amen.
Give a man a fish: <%-{-{-{-<
Re^3: Bioinformatics: Regex loop, no output
by tonto (Friar) on Nov 17, 2015 at 21:05 UTC