in reply to Re^2: count backrefenence regex
in thread count backrefenence regex
you didn't
  DB<269> x "GATCGGGGACTTAGGATCCGATCT" =~ /(GATC)/g
0  'GATC'
1  'GATC'
2  'GATC'
  DB<270> x "GATCGGGGACTTAGGATCCGATCT" =~ /(GATCT)/g
0  'GATCT'
  DB<271>
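For anyone who wants the same counts outside the debugger, here is a minimal sketch. It only relies on a /g match returning all matches in list context; the variable names are mine, not from the thread.

```perl
use strict;
use warnings;

my $dna = "GATCGGGGACTTAGGATCCGATCT";

# A /g match in list context returns every match of the capture group.
my @hits = $dna =~ /(GATC)/g;

# Assigning the match to an empty list, then to a scalar, yields the count.
my $count = () = $dna =~ /GATCT/g;

printf "GATC: %d, GATCT: %d\n", scalar(@hits), $count;
```

Note that /g continues after the end of the previous match, so overlapping occurrences are not counted this way.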
> each unique substring of length >= some minimum length (I used 3 in my code) that occur more than once.
That's not solvable with a trivial regex because of the overlaps°. I suppose tybalt's complex solution, with forced backtracking and embedded code for temporary results, already nailed it.
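If a regex isn't a requirement, a brute-force approach without any regex handles the overlaps naturally. This is only a rough sketch and not tybalt's solution: repeated_substrings is a hypothetical helper name, the minimum length of 7 is picked for the demo, and the number of substrings grows quadratically with the input length, so it is only reasonable for short strings.

```perl
use strict;
use warnings;

# Hypothetical helper: count every substring of at least $min characters,
# overlapping occurrences included, and keep those seen more than once.
sub repeated_substrings {
    my ($str, $min) = @_;
    my $len = length $str;
    my %count;

    for my $start (0 .. $len - $min) {
        for my $size ($min .. $len - $start) {
            $count{ substr($str, $start, $size) }++;
        }
    }

    my %repeated;
    for my $sub (keys %count) {
        $repeated{$sub} = $count{$sub} if $count{$sub} > 1;
    }
    return \%repeated;
}

# Demo input: the footnote string with the brackets removed.
my $repeats = repeated_substrings("AAA_BBB_CCC_AAA_BBB_BBB_CCC", 7);
print "$_ => $repeats->{$_}\n" for sort keys %$repeats;
```

Both AAA_BBB_ and BBB_CCC are reported here, although their occurrences overlap in the input.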
But I'm pretty sure we had this question here in the past. Maybe try Super Search.
Also, identifying repeated sequences seems to be a standard task in BioInf, so some libraries should offer this.
Cheers Rolf
(addicted to the Perl Programming Language :)
Wikisyntax for the Monastery
°) (AAA_[BBB_)CCC]_(AAA_BBB_)[BBB_CCC]: the brackets ( and [ mark different repeated but overlapping sequences.