in reply to Regex question

What this regular expression will match is:
/(
    (\w)+   #any char [A-Za-z0-9_]
    2       #followed by a 2
 ){2,}
/x;         #this series at least two times.
It would fail in all the cases mentioned so far because there is no literal 2 character in any of them. What it would match is something like:
aa2cdafdbb2
bb2bb2
a2a2
each of which fits the pattern as you wrote it above. I think what you meant to put down, or what I would guess the Perl book you referred to had as the regular expression, is more along the lines of:
/((\w)\2){2,}/
which does what you mention (sort of, depending on what you mean by a repeated string). How it works is that the character captured by the second set of parentheses, i.e. the (\w), is placed in \2, so at that point the regex has to match two of the same character in a row. The {2,} on the outside means that the pattern to its left (not the exact text matched, but the pattern of "a character in class \w followed by the exact same character") has to happen at least two times in a row. So aa would not match, nor would aabcc, but things like aabb would, and so would abc33444d, since the regex is not anchored and the 3344 in the middle is enough. If you need more clarification, feel free to ask.
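For what it's worth, here is a small sketch you could run to check the examples above for yourself. The names $literal_two, $doubled and matched_text are just ones I made up for illustration; the two patterns are the ones discussed above. It uses $& rather than wrapping the pattern in another set of parentheses, because an extra outer group would renumber the captures and \2 would no longer point at the (\w).

```perl
use strict;
use warnings;

# The pattern as originally written: word chars followed by a literal 2,
# that whole series at least twice.
my $literal_two = qr/((\w)+2){2,}/;

# The backreference version: a \w char followed by the same char again,
# and that pair at least twice in a row.
my $doubled = qr/((\w)\2){2,}/;

# Hypothetical helper: returns the matched substring, or undef on failure.
sub matched_text {
    my ($re, $str) = @_;
    return $str =~ $re ? $& : undef;
}

for my $str (qw(aa aabcc aabb abc33444d a2a2)) {
    for my $pair ([literal_two => $literal_two], [doubled => $doubled]) {
        my ($name, $re) = @$pair;
        my $hit = matched_text($re, $str);
        printf "%-10s %-12s %s\n", $str, $name,
            defined $hit ? "matched '$hit'" : "no match";
    }
}
```

Note that for abc33444d only the 3344 part is what actually gets matched; the rest of the string is simply skipped over.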

update: you can add use re 'debug'; to the top of your script to get a better idea of how the regex engine is working. And as always, there is good ol' perlre.
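A minimal sketch of what that could look like, assuming the backreference pattern and one of the test strings from above (the variable names are only for illustration). The debugging output, i.e. the compiled regex program and the step-by-step match trace, goes to STDERR, so the print at the end still shows up cleanly on STDOUT.

```perl
use strict;
use warnings;
use re 'debug';    # lexically scoped: affects regexes compiled in this file/block

my $str     = 'abc33444d';
my $matched = $str =~ /((\w)\2){2,}/ ? $& : undef;

print "matched: $matched\n" if defined $matched;
```

The trace is fairly verbose, so it is probably easiest to read with a short test string like this one.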

update2: perlretut's section on Matching Repetitions is another good reference.

-enlil

Re: Re: Regex question
by Anonymous Monk on Jan 27, 2003 at 20:24 UTC
    Thanks a lot, Enlil, I see the point now.