tmolosh has asked for the wisdom of the Perl Monks concerning the following question:
hi all - i have a workaround for this but want to understand why my 1st approach did not work. This relates to reporting on "direct repeats" in DNA|RNA sequences.
here is a command line example that generates a random sequence (1,000,000 loop iterations) and tries to find direct repeats of length >= 3
perl -e '@a=qw(g a t c g a t c g a t c); for(1..1000000){$x=int(1000*rand());$s.=$a[$x]} while($s=~/(?<repeat>\w{3,})\w*\g{repeat}/g){ $+{repeat};$c=()=$s=~/$+{repeat}/g;print "$+{repeat} : $c\n"}'
agtcg : 0
agtcg : 0
agtcg : 0
^C
if i remove the //g from the inner (counting) match, it reports a count of 1 but no substring
perl -e '@a=qw(g a t c g a t c g a t c); for(1..1000000){$x=int(1000*rand());$s.=$a[$x]} while($s=~/(?<repeat>\w{3,})\w*\g{repeat}/g){ $+{repeat};$c=()=$s=~/$+{repeat}/;print "$+{repeat} : $c\n"}'
: 1
: 1
: 1
: 1
if i 1st assign $+{repeat} to a variable, i get the substring, but without //g the count is still 1 (as expected, since the match is not global)
perl -e '@a=qw(g a t c g a t c g a t c); for(1..1000000){$x=int(1000*rand());$s.=$a[$x]} while($s=~/(?<repeat>\w{3,})\w*\g{repeat}/g){ $m=$+{repeat};$c=()=$s=~/$m/;print "$m : $c\n"}'
gcacgt : 1
cga : 1
tggg : 1
ttg : 1
tta : 1
cct : 1
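The difference between the last two runs (empty substring vs. the real one) can probably be reproduced on a tiny string. The sketch below is only an illustration under assumptions: the toy sequence and the variable names ($saved, $count, $after) are made up, and it relies on %+ always describing the most recent successful match in the current scope.

```perl
use strict;
use warnings;

my $s = "gattacagatt";    # made-up toy sequence, not the random one above
my ($saved, $count, $after);

if ($s =~ /(?<repeat>\w{3,})\w*\g{repeat}/) {
    $saved = $+{repeat};            # copy the capture right away
    $count = () = $s =~ /$saved/;   # second successful match, no named groups
    $after = $+{repeat} // '';      # %+ now belongs to that second match
}

print "saved='$saved' count=$count after='$after'\n";
# prints: saved='gatt' count=1 after=''
```

So the inner match itself seems to be what empties $+{repeat} before the print, which would explain why copying it into a variable first brings the substring back.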
adding //g back gets back to the non-functional state:
perl -e '@a=qw(g a t c g a t c g a t c); for(1..1000000){$x=int(1000*rand());$s.=$a[$x]} while($s=~/(?<repeat>\w{3,})\w*\g{repeat}/g){ $m=$+{repeat};$c=()=$s=~/$m/g;print "$m : $c\n"}'
gctgca : 0
gctgca : 0
gctgca : 0
^C
any explanation so i can better understand regexes would be greatly appreciated
thanks
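A possible reading of the endless "count 0" output, offered as a sketch rather than as anything taken from the replies: both the outer while and the inner count use //g on the same $s, so they share pos($s). The inner list-context //g appears to start at the position where the outer match ended, find nothing, and then reset pos($s), which sends the outer loop back to the start of the string. The toy sequence, the variable names and the "count on a copy" idea are all assumptions for illustration.

```perl
use strict;
use warnings;

my $s = "gattacagatt";    # made-up toy sequence

# Scalar-context //g records where the match ended in pos($s).
my $found = $s =~ /(?<repeat>\w{3,})\w*\g{repeat}/g;
my $m = $+{repeat};
my $pos_outer = pos($s);

# List-context //g on the same string starts at pos($s), runs until it
# fails, and a failed //g (without /c) resets pos($s).
my $c_same = () = $s =~ /\Q$m\E/g;
my $pos_inner = pos($s);

# Counting on a copy leaves pos($s) alone and scans from the start.
my $copy = $s;
my $c_copy = () = $copy =~ /\Q$m\E/g;

print "pos after outer match: $pos_outer\n";                   # 11
print "count on same string:  $c_same\n";                      # 0
print "pos after inner match: ", $pos_inner // 'undef', "\n";  # undef
print "count on a copy:       $c_copy\n";                      # 2

# The same idea applied to the loop: the outer //g keeps its position.
my @report;
while ($s =~ /(?<repeat>\w{3,})\w*\g{repeat}/g) {
    my $rep  = $+{repeat};
    my $text = $s;                           # hypothetical scratch copy
    my $n    = () = $text =~ /\Q$rep\E/g;
    push @report, "$rep : $n";
}
print "$_\n" for @report;                    # gatt : 2
```

Copying a 1,000,000-character string on every iteration is wasteful; saving pos($s) before the inner count and restoring it afterwards should have the same effect without the copy.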
Re: count backrefenence regex
by haukex (Archbishop) on Oct 10, 2021 at 21:01 UTC
by tmolosh (Initiate) on Oct 11, 2021 at 02:07 UTC

Re: count backrefenence regex
by tybalt89 (Monsignor) on Oct 10, 2021 at 23:09 UTC

Re: count backrefenence regex
by LanX (Saint) on Oct 10, 2021 at 20:41 UTC
by tmolosh (Initiate) on Oct 11, 2021 at 02:33 UTC
by AnomalousMonk (Archbishop) on Oct 11, 2021 at 03:11 UTC
by LanX (Saint) on Oct 11, 2021 at 10:03 UTC

Re: count backrefenence regex
by tmolosh (Initiate) on Oct 16, 2021 at 14:05 UTC