in reply to Re: Regular Expression - Need HElp..pls
in thread Regular Expression - Need HElp..pls

'\w' allows the non-alpha character '_' (which Perl considers a "word" character). The '$' anchor also allows a trailing '\n' in the string (the \n isn't "matched", but '$' permits it). Since I understood the OP's criteria to exclude any sort of space or special character, the solution is to use '\z' instead of '$'.

Update: As pointed out by Roy Johnson, \w also matches numeric digits, so the use of \d is redundant.
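
For anyone who wants to check these points, here is a minimal sketch. The sample strings and variable names are made up for illustration and are not from the OP's data.

```perl
use strict;
use warnings;

# Hypothetical input: letters followed by a single trailing newline.
my $with_newline = "abc\n";

# '$' tolerates one trailing newline; '\z' insists on the true end of the string.
my $dollar_ok = ($with_newline =~ /^[a-z]+$/)  ? 1 : 0;
my $z_ok      = ($with_newline =~ /^[a-z]+\z/) ? 1 : 0;

# \w covers letters, digits, and the underscore, but not the hyphen.
my $underscore_is_word = ('_' =~ /\A\w\z/) ? 1 : 0;
my $digit_is_word      = ('7' =~ /\A\w\z/) ? 1 : 0;
my $hyphen_is_word     = ('-' =~ /\A\w\z/) ? 1 : 0;

print "dollar=$dollar_ok z=$z_ok underscore=$underscore_is_word ",
      "digit=$digit_is_word hyphen=$hyphen_is_word\n";
```

With '$' the newline-terminated string is accepted, while '\z' rejects it. The last three checks show why [\w\d] is no broader than \w, and why \w alone lets '_' through.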


Dave

Re^3: Regular Expression - Need HElp..pls
by GhodMode (Pilgrim) on Jan 31, 2006 at 18:55 UTC

    Dave's exactly right! So, I'll recommend and explain something which combines our methods ...

    /^[:alpha:][[:alpha:]\d-]*\z/

    I favor one-liners, but that's just a style preference. I changed my \w to [:alpha:], my ^ to \A, and my $ to \z per davido's recommendation, but I stuck with the \d because it's shorter and still works in this case. I left off the m at the beginning because that's the default anyway. I left off the x at the end because this regex doesn't use extended patterns (ref: Extended Patterns).

    • \A : Beginning of the string
    • [:alpha:] : One alphabetic character (ref: perlre)
    • [[:alpha:]\d-]* : 0 or more characters consisting of alphabetic characters, digits, or dashes
    • \z : The end of the string
    --
    -- GhodMode
    

      Your solution is still wrong as proposed:

      /^[:alpha:][[:alpha:]\d-]*\z/

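      The first mistake is that [:alpha:] by itself isn't a POSIX character class. As written, Perl reads it as an ordinary bracketed class matching one of the characters ':', 'a', 'l', 'p', or 'h' (and warns about it when warnings are enabled), which isn't what you meant. It needs to be presented like this: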

      [[:alpha:]]
      ^_________^______Note the outer set of brackets.
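
A quick way to see the difference between the two spellings is sketched below. As far as I know, Perl accepts the single-bracket form with only a warning and treats it as a plain list of characters. The helper name `matches` and the sample characters are made up for this illustration.

```perl
use strict;
use warnings;
no warnings 'regexp';   # silence the POSIX-syntax warning for this demo only

# Single brackets: an ordinary class holding ':', 'a', 'l', 'p', 'h'.
my $bare    = qr/\A[:alpha:]\z/;
# Double brackets: the POSIX class for any alphabetic character.
my $doubled = qr/\A[[:alpha:]]\z/;

# Hypothetical helper: returns 1 if the string matches the pattern, else 0.
sub matches {
    my ($re, $string) = @_;
    return $string =~ $re ? 1 : 0;
}

for my $char ('a', 'z', ':') {
    printf "%-3s bare=%d doubled=%d\n",
        "'$char'", matches($bare, $char), matches($doubled, $char);
}
```

The letter 'z' is the telling case: the doubled form accepts it, the bare form does not, while ':' goes the other way.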

      The next problem is that the OP stated that the first character cannot be a hyphen character, but he didn't exclude numeric digits. Your solution will fail if the first character is a numeric digit.

      Also, '^' can match at the beginning of a "line" (when the /m modifier is in effect). That isn't an issue for this particular regexp, which doesn't use /m, but it is worth noting that '^' is different from '\A', which always matches only at the beginning of the string.
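
Here is a small sketch of where '^' and '\A' part ways. The two-line string is a made-up example. Note that '^' only gains its "start of any line" meaning under the /m modifier; without /m it behaves like '\A'.

```perl
use strict;
use warnings;

# Hypothetical two-line string; "second" starts a line but not the string.
my $text = "first\nsecond";

# Without /m, '^' matches only at the very start of the string.
my $caret_plain = ($text =~ /^second/)   ? 1 : 0;
# With /m, '^' also matches right after an embedded newline.
my $caret_multi = ($text =~ /^second/m)  ? 1 : 0;
# '\A' ignores /m and still means "start of the string".
my $big_a_multi = ($text =~ /\Asecond/m) ? 1 : 0;

print "caret=$caret_plain caret/m=$caret_multi \\A/m=$big_a_multi\n";
```

So '\A' is the safer habit if someone later adds /m to the pattern.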

      If you really despise the /x modifier, and prefer to avoid [:digit:] for whatever reason (maybe less typing?), you could rewrite my original solution like this:

      m/\A[[:alpha:]\d][[:alpha:]\d-]*\z/

      But I think the version with the /x modifier is easier to read since it keeps individual anchors together, and everything else separate.
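
To compare the two styles side by side, here is a sketch that runs the compact pattern above and an /x layout of the same pattern over a few test strings. The /x layout, the helper name `is_valid`, and the sample inputs are my own assumptions for illustration, not quoted from earlier in the thread.

```perl
use strict;
use warnings;

# Compact form, as given in the post above.
my $compact = qr/\A[[:alpha:]\d][[:alpha:]\d-]*\z/;

# The same pattern laid out with the x modifier (an assumed layout).
my $spaced = qr/
    \A
    [[:alpha:]\d]      # first character: a letter or a digit, never a hyphen
    [[:alpha:]\d-]*    # then any number of letters, digits, or hyphens
    \z
/x;

# Hypothetical helper: returns 1 if the string matches the pattern, else 0.
sub is_valid {
    my ($re, $string) = @_;
    return $string =~ $re ? 1 : 0;
}

# Made-up sample inputs covering the cases discussed in this thread.
my @samples = ('abc-123', '9lives', '-abc', "abc\n", 'a_b', 'a b', '');

for my $sample (@samples) {
    (my $shown = $sample) =~ s{\n}{\\n}g;   # make the newline visible
    printf "%-10s compact=%d spaced=%d\n",
        "'$shown'", is_valid($compact, $sample), is_valid($spaced, $sample);
}
```

Both forms should agree on every input: the first two samples pass, and the leading hyphen, trailing newline, underscore, space, and empty string are all rejected.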


      Dave