
regular expressions - finding repeats

by glwtta (Hermit)
on Feb 26, 2003 at 17:44 UTC [id://238858]

glwtta has asked for the wisdom of the Perl Monks concerning the following question:

OK, I've come across this problem several times and figured it was time to ask. If I have a string, how do I match any one of several characters repeated a certain number of times? Basically, /[ABCD]{5}/ will match any five characters drawn from A, B, C or D; I need it to match A, B, C or D and then match that same character 5 times. I first thought something like /([ABCD]){5}/ should do the trick, but that doesn't seem to work. I also need to find repeats of more than one character, which in my mind would look like /([ABCD]{2}){5}/.

Any thoughts?

Replies are listed 'Best First'.
Re: regular expressions - finding repeats
by blokhead (Monsignor) on Feb 26, 2003 at 17:55 UTC
         /([ABCD])\1{4}/

    Match (and capture with parens) the first character, then match four consecutive repeats of whatever character you captured, for five in total. The \1 is called a backreference, and it is used to match repeated things. Check out perlre and search for "backreference" for more info.
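To see the difference between the backreference and the pattern from the question, here is a minimal sketch. The sub names and the sample strings are made up for illustration and are not from the original post.

```perl
use strict;
use warnings;

# Returns 1 if $s contains one of A-D repeated five times in a row.
# The capture is the first copy, \1{4} demands four more of the same.
sub has_same_char_run {
    my ($s) = @_;
    return $s =~ /([ABCD])\1{4}/ ? 1 : 0;
}

# The question's first attempt: any five characters from the class,
# not necessarily the same one each time.
sub has_any_five {
    my ($s) = @_;
    return $s =~ /([ABCD]){5}/ ? 1 : 0;
}

# Hypothetical sample strings.
for my $s (qw(AAAAA ABCDA xxBBBBByy DDDD)) {
    printf "%-10s same-char run: %d  any five: %d\n",
        $s, has_same_char_run($s), has_any_five($s);
}
```

"ABCDA" is the interesting case: the character class version accepts it, while the backreference version does not.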

    Update: A backref can also match more than one character. You can match the strings "ABCABC", "ABBABB", "CBCB" and "DD" with the following regex:

         /([ABCD]+)\1/
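For the fixed-width case in the question (a two-character unit repeated five times), the same idea should work with a counted capture. This is a sketch only; repeated_unit and its parameters are hypothetical names, and the unit length and repeat count are interpolated into the pattern.

```perl
use strict;
use warnings;

# Returns the repeated unit if $s contains a $len-character unit (drawn
# from A-D) occurring $count times in a row; returns undef otherwise.
sub repeated_unit {
    my ($s, $len, $count) = @_;
    my $extra = $count - 1;    # the capture itself is the first copy
    return $s =~ /([ABCD]{$len})\1{$extra}/ ? $1 : undef;
}

# Hypothetical sample strings.
for my $s (qw(ABABABABAB ABABABAB xxCDCDCDCDCDyy)) {
    my $unit = repeated_unit($s, 2, 5);
    print "$s: ", (defined $unit ? "unit '$unit' repeats 5 times" : "no match"), "\n";
}
```

With a length of 2 and a count of 5 this is the pattern /([ABCD]{2})\1{4}/, which is what /([ABCD]{2}){5}/ in the question was reaching for.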

    blokhead

Re: regular expressions - finding repeats
by BronzeWing (Monk) on Feb 26, 2003 at 21:03 UTC

    Well blokhead already posted what's probably the best answer, and the one that I wanted to give, but I've got to post something. So here:

    if ($String =~ join('|',map($_ x 5, qw(A B C D E)))) {}

    Admittedly it takes 3-4 times as long according to benchmark... but it was so much more fun to write ;p.
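A comparison like this can be run with the core Benchmark module. The following is a rough sketch, not the benchmark that was actually used: the sample string and the iteration count are assumptions, and the numbers will vary by machine and Perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical input; a real comparison should use representative data.
my $text = "xxxxCCCCCxxxx";

cmpthese(200_000, {
    # Backreference version from the first reply, widened to A-E.
    backref => sub { $text =~ /([ABCDE])\1{4}/ },
    # Alternation rebuilt from join/map on every call, as posted above.
    joined  => sub { $text =~ join('|', map($_ x 5, qw(A B C D E))) },
});
```

cmpthese prints a small table of rates and relative speeds for the two subs.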

    -BronzeWing

      Here's a little attempt to optimize this code. (Not that it's worth optimizing like this since backreferences rule, but anyway... :)) The pattern can be extended to the line below, and as a bonus you get an even uglier expression.

      '(?=[A-E])(?:' . join('|',map($_ x 5, qw(A B C D E))) . ')'

      A side note that actually can be relevant: if you don't rebuild the pattern each time you'll save a lot. And if you use qr// or /o you'll be even happier, usually.
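A minimal sketch of the "build it once" advice, assuming the same five letters; the variable and sub names are made up for illustration.

```perl
use strict;
use warnings;

# Build the alternation once, outside any loop.
my @chars = qw(A B C D E);
my $alt   = join '|', map { $_ x 5 } @chars;    # AAAAA|BBBBB|...

# Compile it once with qr//. The lookahead rejects positions that
# cannot start a match before the alternation is tried.
my $five_re = qr/(?=[A-E])(?:$alt)/;

# Reuses the precompiled pattern on every call.
sub has_five {
    my ($s) = @_;
    return $s =~ $five_re ? 1 : 0;
}

# Hypothetical sample strings.
for my $s (qw(xxEEEEExx ABCDE AAAAB)) {
    print "$s: ", (has_five($s) ? "match" : "no match"), "\n";
}
```

The join and map now run a single time, and each later match reuses the compiled $five_re instead of rebuilding the pattern string.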

      ihb
Re: regular expressions - finding repeats
by rir (Vicar) on Feb 27, 2003 at 00:25 UTC
    You may wish to avoid back references.
    /(A{5}|B{5}|C{5}|D{5})/
    Or for your other case:

    /(A{2,5}|B{2,5}|C{2,5}|D{2,4})/

    /(A{2,5}|B{2,5}|C{2,5}|D{2,5})/

    Update: fixed typo per Nkuvu

      Why the D{2,4} on the second? Typo?
