Hi Monks,
I am trying to select lines that contain a DNA string of a certain length. The following code prints any DNA string that is 8 nucleotides or longer. However, I want to print only DNA strings that are exactly 8 nucleotides long, such as "ATGATGAC". I thought {8} would match exactly 8 characters, but it looks like I am wrong! I also tried ATGC{8,8}; that did not work either.
In addition, in a separate program, I want to select DNA strings that are between 8 and 21 nucleotides long. Can you please give me any suggestions?
Thank you.
PS. I was able to solve this problem using the "length" function without any regex, but I would like to learn the regex solution.

#!/usr/bin/perl
use warnings;
use strict;

while (<DATA>){
    my $string = $_;
    if ($string =~ /[ATGC]{8}/) {
        print "$string";
    }
}
__DATA__
@3009W:27:32
GCTCT
+
%.8:9
@3009W:27:40
TTGGG
+
0(*2+
@3009W:31:26
AGCCT
+
5<=46
@3009W:31:35
TCAGAAAACTG
+
0.5*.--%-0-
@3009W:32:34
GGGCCTAACCTGGGAGCCCCT
+
A@.:158+,--*-%-**--%-
@3009W:34:32
CCATCATCTGGGG
+
:-:>>;;55755&
@3009W:36:21
GACTT
+
(8.7(
@3009W:40:24
ATGATCC
+
44.0,.%
@3009W:42:22
GCTTCCAGGGTCAGTTTGGGAAAC
+
:@>4;4888)1//**-%+5+25,.
@3009W:47:23
GAGCATCGA
+
%*1.0...-
@3009W:49:23
GAGTTCCATCGAAATGTACAAGCTTTACGTTTAAAAC
+
/3....0304036-22.,--(*.09*00,11),00(.
@3009W:14:90
AGCAA
+
82528
@3009W:17:84
GAAACACAC
+
05?4=:<:0
@3009W:17:95
TTTTTCTTT
+
;<<<-07<1
@3009W:19:89
CCTCTACC
+
?:>>:;83
@3009W:19:90
AAGAA
+
:4<;2
@3009W:20:74
GGTTCC
+
2&-.2.
@3009W:22:94
CATTTGGAA
+
AAAB9>8>:
@3009W:23:79
CTTACAA
+
@@9@@@@
@3009W:23:93
TCTTTTTC
+
@@@AAA/A
@3009W:24:80
GTGAGC
+
<AAA@@
@3009W:25:79
AATAT
+
?8=.0
@3009W:26:89
AGGCA
+
BB>BC
@3009W:26:99
ATCCATAT
+
/88(3979
@3009W:27:83
AGGCA
+
AA>@@
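
For what it's worth, here is a minimal sketch of one likely explanation and one possible regex approach. An unanchored /[ATGC]{8}/ succeeds as soon as it finds 8 matching characters anywhere in the line, so longer strings match too. Anchoring the pattern with ^ and $ forces the whole line to have the stated length. In ATGC{8,8}, the quantifier applies only to the preceding C, so that pattern looks for "ATG" followed by eight Cs. The helper names is_exactly_8 and is_8_to_21 are made up for illustration, and the sketch assumes each sequence sits alone on its line, as in FASTQ data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: true only if the whole line is exactly 8 nucleotides.
# $ also matches just before a trailing newline, so no chomp is needed.
sub is_exactly_8 {
    my ($line) = @_;
    return $line =~ /^[ATGC]{8}$/ ? 1 : 0;
}

# Hypothetical helper: true only if the whole line is 8 to 21 nucleotides.
sub is_8_to_21 {
    my ($line) = @_;
    return $line =~ /^[ATGC]{8,21}$/ ? 1 : 0;
}

# Without a character class, {8,8} quantifies only the final C.
sub matches_atg_then_8c {
    my ($line) = @_;
    return $line =~ /ATGC{8,8}/ ? 1 : 0;
}

for my $seq (qw(ATGATCC ATGATGAC GAGCATCGA
                GGGCCTAACCTGGGAGCCCCT GCTTCCAGGGTCAGTTTGGGAAAC)) {
    printf "%-25s exactly8=%d range8to21=%d\n",
        $seq, is_exactly_8($seq), is_8_to_21($seq);
}
```

In the original loop, swapping the condition for `$string =~ /^[ATGC]{8}$/` (or `{8,21}` in the second program) should be enough. Header and quality lines fail the anchored match because they contain characters outside the class, unless a quality line happens to consist only of A, T, G and C.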

In reply to Why doesn't quantifier work with character classes? by rnaeye