I have not been able to write a regex that matches a discontinuous string while still honoring the numeric quantifier in {}. The first example below behaves correctly, but only because the string is continuous; the second shows how the same regexes fail when the string is interrupted by dashes. Example one:
$seq = 'ATCGGATCTGGC';
$tag = '___';
$seq =~ s/[ATGC]{2}/$&$tag/;
$seq =~ s/$tag[ATGC]{4}/$&$tag/;
Printing $seq outputs:
AT___CGGA___TCTGGC
That is exactly what the regexes should do. But if $seq contains dashes, they no longer work. In example two, everything is the same as above, except that
$seq = 'A-C-G--CTGGC';
Printing $seq now outputs:
A-C-G--CT___GGC
instead of the desired output, where the regexes effectively ignore the dashes and count only letters:
A-C___G--CTG___GC
Any ideas on how I can write the regexes to match when gaps are included?
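One possible approach, offered as a sketch rather than a definitive answer: make the repeated unit "one letter plus any dashes in front of it", so that {N} counts letters only. The helper name tag_gapped and its parameters are hypothetical, invented here for illustration. The sketch assumes the tag should land directly after the last counted letter; note that the desired output above has one dash fewer than the input, so the exact placement relative to the dash after C is a guess.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): insert $tag after the
# first $first letters and again after the next $next letters, counting
# only A/T/G/C and skipping over dashes.
sub tag_gapped {
    my ($seq, $tag, $first, $next) = @_;

    # (?:-*[ATGC]) is one letter together with any dashes directly before
    # it, so {N} counts letters only and the match always ends on a letter.
    $seq =~ s/((?:-*[ATGC]){$first})/$1$tag/;

    # \Q...\E quotes the tag in case it ever contains regex metacharacters.
    $seq =~ s/(\Q$tag\E(?:-*[ATGC]){$next})/$1$tag/;

    return $seq;
}

print tag_gapped('ATCGGATCTGGC', '___', 2, 4), "\n";   # AT___CGGA___TCTGGC
print tag_gapped('A-C-G--CTGGC', '___', 2, 4), "\n";   # A-C___-G--CTG___GC
```

Every dash of the input is preserved, so the second line prints A-C___-G--CTG___GC. If the tag should instead come after any dashes that trail the counted letter, swapping the unit to (?:[ATGC]-*) in both substitutions would be the analogous variant.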
In reply to Regex to match range of characters broken by dashes by Q.and