Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks, I want to match something using a regular expression between 1 and 30 times. However, using {m,n} only allows me to match a maximum of 9 times. Does anyone know how I can do this?

Here's what I've got:

# I need to match the $4 up to ~ 30 times.
if ($ssh[0] =~ /^\s*>(\w+)\-(\w+)\-(\w+)\s+((\n\w+){1,9})/) {
    print "$1 $2 $3 \n $4\n";
}

# this is what i'm matching:
>MWG869551-C3277-T7
CCCATCCAGTTTATGAGATCGCGCATTGATCGCCCGAGGCGGTCTA
GCCGCCACACCACTAATGTATCGCCGGCGCGCAATTGCTCTTTAATTTTCTCAAGCCCCG
GCCGCGCAGTGACAACGCCAGAGACCTTATCAGTGATGACCTTTTCGCAGCCCGCCGCCC
GTAAAGCGTCGGTTTGAAGATCCAAATTTTGTTCGATGGTCGAGACCCGCGCATAGCCAA
TCTTCATACGAGCAGTGCCCGGCCATTGAGAAAAGGAAGAAAACTCATGTTATCCTGAAA
TTCATTGCCCTAGTTTTCTTTACCGAGTAGATTTACCGGCATTCGGTAGAATTGGCAACG
TGTTGGAGATCGCGTGACCATTACC

Replies are listed 'Best First'.
Re: regex issues
by grinder (Bishop) on Jan 04, 2005 at 16:25 UTC
    However, using {m,n} only allows me to match a maximum of 9 times

    Oh really? What version of Perl are you using?

    % perl -le 'print shift()=~/^ab{20,30}$/ ? "yay" : "nay"' abb
    nay
    % perl -le 'print shift()=~/^ab{20,30}$/ ? "yay" : "nay"' abbbbbbbbbbbbbbbbbbbbb
    yay
    % perl -le 'print shift()=~/^ab{20,30}$/ ? "yay" : "nay"' abbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
    nay

    Works for me...

    - another intruder with the mooring of the heart of the Perl

Re: regex issues
by ikegami (Patriarch) on Jan 04, 2005 at 16:28 UTC

    Do you really need to specify a max? Can't you just replace {1,9} with +?

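    The last \s+ needs to be \s* (or needs to be removed) for the regexp to match what you provided. The only whitespace after the header is the newline, and \s+ consumes it, leaving nothing for the \n in (\n\w+).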
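To see the \s+ versus \s* difference in isolation, here is a minimal sketch. The two-line record and the variable names ($record, $with_plus, $with_star) are made up for illustration and are not from the original post:

```perl
use strict;
use warnings;

# Hypothetical record: header line, newline, one sequence line.
my $record = ">ID1-ID2-ID3\nACGT\n";

# \s+ must take the only newline after the header, so (\n\w+) has
# nothing left to match and the whole pattern fails.
my $with_plus = $record =~ /^\s*>(\w+)-(\w+)-(\w+)\s+((\n\w+){1,9})/ ? 1 : 0;

# \s* may match zero characters, which leaves the newline for the group.
my $with_star = $record =~ /^\s*>(\w+)-(\w+)-(\w+)\s*((\n\w+){1,9})/ ? 1 : 0;

print "with \\s+: $with_plus\n";
print "with \\s*: $with_star\n";
```

If the real data has trailing spaces after the header, \s+ could still match, so this only describes the record as posted.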

    (\n\w+) should be (?:\n\w+) (speeds things up a little), or maybe even (?:\n[ACGT]+) (for conciseness).

    $ssh[0] = <<'__EOI__';
    >MWG869551-C3277-T7
    CCCATCCAGTTTATGAGATCGCGCATTGATCGCCCGAGGCGGTCTA
    GCCGCCACACCACTAATGTATCGCCGGCGCGCAATTGCTCTTTAATTTTCTCAAGCCCCG
    GCCGCGCAGTGACAACGCCAGAGACCTTATCAGTGATGACCTTTTCGCAGCCCGCCGCCC
    GTAAAGCGTCGGTTTGAAGATCCAAATTTTGTTCGATGGTCGAGACCCGCGCATAGCCAA
    TCTTCATACGAGCAGTGCCCGGCCATTGAGAAAAGGAAGAAAACTCATGTTATCCTGAAA
    TTCATTGCCCTAGTTTTCTTTACCGAGTAGATTTACCGGCATTCGGTAGAATTGGCAACG
    TGTTGGAGATCGCGTGACCATTACC
    >...
    __EOI__

    if ($ssh[0] =~ /^\s*>(\w+)\-(\w+)\-(\w+)\s*((?:\n[ACGT]+)+)/) {
        print "$1 $2 $3 \n $4\n";
    }

    __END__
    output
    ======
    MWG869551 C3277 T7

    CCCATCCAGTTTATGAGATCGCGCATTGATCGCCCGAGGCGGTCTA
    GCCGCCACACCACTAATGTATCGCCGGCGCGCAATTGCTCTTTAATTTTCTCAAGCCCCG
    GCCGCGCAGTGACAACGCCAGAGACCTTATCAGTGATGACCTTTTCGCAGCCCGCCGCCC
    GTAAAGCGTCGGTTTGAAGATCCAAATTTTGTTCGATGGTCGAGACCCGCGCATAGCCAA
    TCTTCATACGAGCAGTGCCCGGCCATTGAGAAAAGGAAGAAAACTCATGTTATCCTGAAA
    TTCATTGCCCTAGTTTTCTTTACCGAGTAGATTTACCGGCATTCGGTAGAATTGGCAACG
    TGTTGGAGATCGCGTGACCATTACC
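If an upper bound is still wanted, {1,30} should work just as well as +. The sketch below is a rough check of that against a made-up record with 12 sequence lines (more than 9). The record contents and the names $record, $body, @lines and $sequence are assumptions for illustration only:

```perl
use strict;
use warnings;

# Made-up record: a header, 12 identical sequence lines, then the next header.
my $record = ">AAA1-BBB2-CCC3\n" . join("\n", ("ACGTACGTAC") x 12) . "\n>next\n";

my ($id1, $id2, $id3, $body);
if ($record =~ /^\s*>(\w+)-(\w+)-(\w+)\s*((?:\n[ACGT]+){1,30})/) {
    ($id1, $id2, $id3, $body) = ($1, $2, $3, $4);
}

# $body starts with a newline, so drop the empty leading field,
# then join the lines into one sequence string.
my @lines    = grep { length } split /\n/, $body;
my $sequence = join '', @lines;

print "$id1 $id2 $id3\n";
print scalar(@lines), " lines, ", length($sequence), " bases\n";
```

The non-capturing (?:...) group keeps $4 as the whole block of lines. With a capturing group inside, $5 would only hold the last repetition.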