Just another Perl shrine
PerlMonks
Hi,

I would like to define a regular expression that matches strings consisting of two sub-strings separated by a single comma. Each sub-string must be non-empty and may consist only of the characters 'A', 'G', 'C' and 'T', with no character repeated. Thus, the pattern should match strings such as:

    A,G

and should not match:

    ,G    <- missing 1st sub-string

So far I have:

    /^(?=[ACGT]{1,4},[ACGT]{1,4}$)(?!.*(.).*\1.*,)(?!,.*(.).*\1).*$/
    /^(?=[ACGT]{1,4},[ACGT]{1,4}$)(?!.*(.).*\g{1}.*,)(?!,.*(.).*\g{1}).*$/

I also tried joining the capture groups with:

    /^(?=[ACGT]{1,4},[ACGT]{1,4}$)(?!.*(.).*\g{1}.*,.*(.).*\g{2}).*$/

Now, (?=[ACGT]{1,4},[ACGT]{1,4}$) seems to enforce the "two sub-strings separated by a single comma" and "consists exclusively of the characters 'A', 'G', 'C' and 'T'" requirements throughout the string, and (?!.*(.).*\1.*,) seems to enforce "without repetition" up to the comma. However, (?!,.*(.).*\1) does not appear to prevent a repeated character after the comma.

I'd greatly appreciate replies with clues and/or patterns that help with the desired matching.

Using perl v5.18.2

Thanks in advance,
Robert

In reply to Regular expression for a comma separated string by naderra
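One possible reading of why the second lookahead has no effect: both lookaheads are evaluated at the start of the string (right after ^), so (?!,.*(.).*\1) can only see a comma if the string begins with one. In addition, its (.) is the second capture group, so \1 there refers to the group in the first lookahead. A minimal sketch of an alternative follows. It is an illustration rather than a definitive answer, it uses a single lookahead whose [^,]* cannot cross the comma, and the helper name is_valid and the extra sample strings are hypothetical.

```perl
use strict;
use warnings;

# Sketch: the negative lookahead rejects any character that occurs
# again before the next comma. Because [^,]* cannot match a comma,
# each sub-string is checked for repeats independently.
my $re = qr/^(?!.*([ACGT])[^,]*\1)[ACGT]{1,4},[ACGT]{1,4}$/;

# Hypothetical helper, named only for this illustration.
sub is_valid { return $_[0] =~ $re ? 1 : 0 }

# 'A,G' and ',G' come from the post; the other strings are made up.
for my $s ('A,G', 'ACGT,TGCA', ',G', 'AA,G', 'A,GG', 'A,G,C') {
    printf "%-10s %s\n", $s, is_valid($s) ? 'match' : 'no match';
}
```

With this pattern the same letter may still appear on both sides of the comma (for example 'A,A'), which is how I read "without repetition" within each sub-string; if repeats across the comma should also be rejected, replacing [^,]* with .* in the lookahead would be one way to try it.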