MadraghRua has asked for the wisdom of the Perl Monks concerning the following question:
I'm interested in using weighted regular expressions in searching DNA/RNA sequences. So, say I take a set of information on important pieces of DNA. If I look at 12 pieces of DNA I might end up with the following matrix:
 A   T   G   C
25  25  25  25   nucleotide 1
10  15  50  25   nucleotide 2
 0  90   5   5   nucleotide 3
12  16  32  40   nucleotide 4
where each number is a percentage weight (the percentage of the time that I can expect to see that nucleotide at that particular position).
So for nucleotide 1, a simple regex would be [atgc], as everything is equally weighted.
Nucleotide 3 would be [tgc], as there is no weight for A.
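The presence-only pattern described above can be derived from the matrix mechanically. What follows is a minimal sketch, not a definitive implementation: the names `@bases`, `@matrix`, and `presence_pattern` are made up for illustration, and it assumes every position has at least one base with a non-zero weight (an all-zero row would produce an empty, invalid character class).

```perl
use strict;
use warnings;

# Column order used by the matrix in the question.
my @bases = qw(a t g c);

# Percentage weights per position, copied from the question.
my @matrix = (
    [ 25, 25, 25, 25 ],    # nucleotide 1
    [ 10, 15, 50, 25 ],    # nucleotide 2
    [  0, 90,  5,  5 ],    # nucleotide 3
    [ 12, 16, 32, 40 ],    # nucleotide 4
);

# Build one character class per position, keeping only the bases
# whose weight is greater than zero. The weights themselves are ignored.
sub presence_pattern {
    my @rows    = @_;
    my $pattern = '';
    for my $row (@rows) {
        my @allowed = grep { $row->[$_] > 0 } 0 .. $#bases;
        $pattern .= '[' . join( '', @bases[@allowed] ) . ']';
    }
    return $pattern;
}

my $pattern = presence_pattern(@matrix);
print "$pattern\n";

# Case-insensitive search of a made-up sequence.
my $sequence = 'CCAGTCAA';
while ( $sequence =~ /($pattern)/gi ) {
    printf "match '%s' ending at offset %d\n", $1, pos($sequence);
}
```

This only records which nucleotides are possible at each position, so a window containing a 5% nucleotide matches just as readily as one containing a 90% nucleotide.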
I would like to write something that would pay attention to the weights at each position, not just the presence of nucleotides.
So nucleotide 3 would be t(90%)g(5%)c(5%) or whatever the correct regex pattern is.
Is this possible? Can anyone give me an example to send me on my way? I have looked in the Friedl book, but I didn't find anything terribly obvious...
Thanks for any help. Go raibh maith agat,
MadraghRua
yet another biologist hacking perl....
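As far as I know, Perl's regex syntax has no way to attach a weight to an alternative, so one way to approach the question is to score candidate windows in ordinary code instead. The sketch below slides a window across the sequence and multiplies the per-position fractions together. The assumptions are mine, not the poster's: positions are treated as independent, the score is a plain product of the percentages divided by 100, and the names `window_score` and `weighted_hits` and the threshold value are hypothetical.

```perl
use strict;
use warnings;

# Map each nucleotide to its column in the matrix (order A T G C).
my %column = ( a => 0, t => 1, g => 2, c => 3 );

# Percentage weights per position, copied from the question.
my @matrix = (
    [ 25, 25, 25, 25 ],    # nucleotide 1
    [ 10, 15, 50, 25 ],    # nucleotide 2
    [  0, 90,  5,  5 ],    # nucleotide 3
    [ 12, 16, 32, 40 ],    # nucleotide 4
);

# Score one window as the product of the per-position fractions.
# Assumes the window is exactly as long as the matrix has rows.
# A symbol outside a/t/g/c, or a zero weight, gives a score of 0.
sub window_score {
    my ($window) = @_;
    my @chars = split //, lc $window;
    my $score = 1;
    for my $i ( 0 .. $#matrix ) {
        my $col = $column{ $chars[$i] };
        return 0 unless defined $col;
        $score *= $matrix[$i][$col] / 100;
    }
    return $score;
}

# Return [offset, window, score] for every window scoring at least $threshold.
sub weighted_hits {
    my ( $sequence, $threshold ) = @_;
    my $width = @matrix;
    my @hits;
    for my $offset ( 0 .. length($sequence) - $width ) {
        my $window = substr $sequence, $offset, $width;
        my $score  = window_score($window);
        push @hits, [ $offset, $window, $score ] if $score >= $threshold;
    }
    return @hits;
}

# Made-up sequence and an illustrative threshold.
for my $hit ( weighted_hits( 'CCAGTCAA', 0.04 ) ) {
    printf "offset %d  %s  score %.4f\n", @$hit;
}
```

The threshold decides how strict the search is: a value near the best achievable product keeps only the most typical windows, while a value just above zero behaves like a presence-only match. Summing logarithms of the weights instead of multiplying them is a common variation when the matrix has many positions.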
Replies are listed 'Best First'.

RE: weighting regex patterns
by KM (Priest) on Aug 18, 2000 at 23:04 UTC

RE: weighting regex patterns
by merlyn (Sage) on Aug 19, 2000 at 02:15 UTC

Re: weighting regex patterns
by MadraghRua (Vicar) on Aug 19, 2000 at 01:19 UTC

RE (tilly) 1: weighting regex patterns
by tilly (Archbishop) on Aug 18, 2000 at 23:25 UTC

Re: weighting regex patterns
by fundflow (Chaplain) on Aug 18, 2000 at 23:20 UTC

Re: weighting regex patterns
by athomason (Curate) on Aug 18, 2000 at 23:17 UTC

RE: weighting regex patterns
by jlistf (Monk) on Aug 18, 2000 at 23:09 UTC