Re: Regular expression matching when it shouldn't

Your match isn't anchored.

There are 3 character classes, so on someRandomName- it could successfully match like so: the first character class matches 's', the second matches 'o' and the third matches 'm', so you have a successful match.

Since REs are greedy, what's really happening is that the first class matches the whole thing except the '-', the second matches the '-', and the third fails. Then the RE backtracks: the first matches everything but 'e-', the second matches 'e-', and the third fails.

Then the first matches everything up to 'me-' and the second matches 'me-'. The third fails, so the second backtracks until it matches just 'm', then the third matches 'e' and SUCCESS!
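To see where the backtracking ends up, you can put capture groups around the three classes. This is only a sketch for illustration: split_match is a made-up helper name, and the pattern is the one from the question without the /i.

```perl
use strict;
use warnings;

# Hypothetical helper: return the three pieces captured by the
# unanchored pattern, or an empty list if nothing matches.
sub split_match {
    my ($text) = @_;
    return $text =~ /([A-Za-z0-9]+)([A-Za-z0-9-]+)([A-Za-z0-9])/;
}

my @parts = split_match('someRandomName-');
print join('|', @parts), "\n";   # prints "someRandomNa|m|e"
```

The trailing '-' is simply left out of the match, which is why the test succeeds when it shouldn't.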

If you want to match at the end of the string, you need to add a $ to the end; or, if you want to match words (my take on the question), you need to add \b to both ends, like so:

if ($domain =~ /[A-Za-z0-9]+[A-Za-z0-9-]+[A-Za-z0-9]$/)    # matches at end of text
if ($domain =~ /\b[A-Za-z0-9]+[A-Za-z0-9-]+[A-Za-z0-9]\b/) # matches only words separate from non-words
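A quick way to compare the variants is to run each against the problem string and print what it actually matched. This is a rough sketch; the hash keys and the matched_text helper are names made up for this example.

```perl
use strict;
use warnings;

# The original pattern plus the two variants, keyed by a descriptive label.
my %variants = (
    unanchored    => qr/[A-Za-z0-9]+[A-Za-z0-9-]+[A-Za-z0-9]/,
    end_anchored  => qr/[A-Za-z0-9]+[A-Za-z0-9-]+[A-Za-z0-9]$/,
    word_boundary => qr/\b[A-Za-z0-9]+[A-Za-z0-9-]+[A-Za-z0-9]\b/,
);

# Hypothetical helper: return the substring matched by the named
# variant, or undef if that variant does not match at all.
sub matched_text {
    my ($label, $text) = @_;
    return $text =~ /($variants{$label})/ ? $1 : undef;
}

for my $label (sort keys %variants) {
    my $hit = matched_text($label, 'someRandomName-');
    printf "%-13s %s\n", $label, defined $hit ? "matched '$hit'" : 'no match';
}
```

On this input only the $ version rejects the string. The \b version still finds someRandomName, because the '-' is a non-word character and so there is a word boundary right before it.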

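Also, I got rid of the /i, which should be faster and isn't needed since the classes already list both cases. As has been noted above, you should use the \w-type escapes where possible instead of the spelled-out classes, bearing in mind that \w also matches an underscore.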

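Your current code will also match my-do^%$%^@#@, which you don't seem to intend. The $ will fix that, since the string ends in stuff that is not in the classes and so it is a fail. Note that \b alone won't reject it: there is a word boundary between 'o' and '^', so my-do still matches. In fact your RE will match any run of 3 or more legal (based on your RE) characters anywhere in any text.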
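If the goal is to validate the whole string rather than find a match somewhere inside it, one option is to anchor at both ends. This is a minimal sketch assuming that is what you want: is_valid_name is a hypothetical helper, and \A and \z are used in place of ^ and $ so that a trailing newline is not silently accepted.

```perl
use strict;
use warnings;

# Hypothetical helper: true only when the entire string fits the pattern.
sub is_valid_name {
    my ($text) = @_;
    return $text =~ /\A[A-Za-z0-9]+[A-Za-z0-9-]+[A-Za-z0-9]\z/ ? 1 : 0;
}

# Single quotes keep the punctuation from being interpolated.
for my $candidate ('my-domain', 'my-do^%$%^@#@', 'someRandomName-') {
    printf "%-16s %s\n", $candidate, is_valid_name($candidate) ? 'ok' : 'rejected';
}
```

With both anchors in place only my-domain passes. The other two are rejected because they contain or end in characters outside the classes.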

Hope this helps.
