>perl -wMstrict -le
"my $seq = 'xxxTATAATGyyyTatAtAtzzz';
my $box = qr{ (?i) TATA [TA] [TA] [TAG] }xms;
;;
my @TATAs;
while ($seq =~ m{ ($box) }xmsg) {
push @TATAs, [ $1, $-[1], ];
}
;;
print qq{matched '$_->[0]' at pos $_->[1]} for @TATAs;
"
matched 'TATAATG' at pos 3
matched 'TatAtAt' at pos 13
Updates:
- Note: The character class [T|A] in the OP is probably not what you want. The | (pipe) character has no special meaning inside a character class; it is just a literal | character, so [T|A] matches exactly one of three characters: 'T', 'A' or '|'.
- Changed example code to use case-insensitive matching per the OP's example. Note that when matching long strings, case-insensitive matching may impose a significant performance penalty; it may be better to convert all strings to a common case, then match case-sensitively.
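
To make both update notes concrete, here is a minimal sketch in script form (not the one-liner above). The names @pipe_hits and $uc_seq are mine, and it assumes plain ASCII sequence data, so that uc() does not change string length and the match offsets remain valid for the original string. I have not benchmarked it against the (?i) version.

```perl
use strict;
use warnings;

# A pipe inside a character class is a literal character, not alternation.
my @pipe_hits = grep { /\A[T|A]\z/ } ('T', 'A', '|', 'G');
print "[T|A] matches: @pipe_hits\n";

# Normalize case once, then match without the (?i) modifier.
my $seq    = 'xxxTATAATGyyyTatAtAtzzz';
my $uc_seq = uc $seq;    # assumption: ASCII data, so offsets are unchanged
my $box    = qr{ TATA [TA] [TA] [TAG] }xms;

my @TATAs;
while ($uc_seq =~ m{ ($box) }xmsg) {
    # Offsets are identical in $seq and $uc_seq, so extract the
    # original-case text from $seq using the capture's start and end.
    push @TATAs, [ substr($seq, $-[1], $+[1] - $-[1]), $-[1] ];
}

print qq{matched '$_->[0]' at pos $_->[1]\n} for @TATAs;
```

This should report the same two matches at the same positions as the case-insensitive one-liner, while the pattern itself stays case-sensitive.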