dani_cv_perl has asked for the wisdom of the Perl Monks concerning the following question:

Can you please help me with a regular expression? I am a newbie to Perl. I have the following requirements: (1) The string can contain only letters, digits, or the '-' symbol. (2) No special characters or spaces are allowed. (3) The string must not start with '-'.

Replies are listed 'Best First'.
Re: Regular Expression - Need HElp..pls
by davido (Cardinal) on Jan 31, 2006 at 17:14 UTC

    Something like this then?

    m/ \A [[:alpha:][:digit:]] [[:alpha:][:digit:]-]* \z /x

    Dave

      Shouldn't that be

      m/ \A (?: [[:alpha:][:digit:]] [[:alpha:][:digit:]-]* )? \z /x

      Your version requires that at least one character be present.

        You could be right. But since the OP doesn't specify whether or not the string must contain at least one character, I'd say that there is sufficient ambiguity in the definition of the problem that there's about an equal chance that mine is right. Good observation. Now the OP can choose which meets his needs.


        Dave
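The only difference between the two patterns above is how they treat the empty string. A minimal sketch to check that (the variable names and the sample strings are illustrative, not from the thread):

```perl
use strict;
use warnings;

# Pattern from the first reply: requires at least one character.
my $non_empty = qr/ \A [[:alpha:][:digit:]] [[:alpha:][:digit:]-]* \z /x;

# Variant from the follow-up: the optional group also accepts the empty string.
my $allow_empty = qr/ \A (?: [[:alpha:][:digit:]] [[:alpha:][:digit:]-]* )? \z /x;

# Illustrative inputs: empty, valid, leading dash, embedded space.
for my $s ('', 'abc-123', '-abc', 'a b') {
    printf "[%s] non_empty=%d allow_empty=%d\n",
        $s,
        ($s =~ $non_empty   ? 1 : 0),
        ($s =~ $allow_empty ? 1 : 0);
}
```

Of these inputs, only the empty string should be classified differently by the two patterns; which behavior is wanted depends on whether the OP considers an empty string valid.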

Re: Regular Expression - Need HElp..pls
by GhodMode (Pilgrim) on Jan 31, 2006 at 17:14 UTC
    /^[\w\d][\w\d-]*$/
    • beginning of the string
    • followed by a single word or digit character
    • followed by 0 or more word, digit, or dash characters
    • followed by the end of the string
    --
    -- GhodMode
    

      '\w' allows the non-alpha character '_' (which Perl considers a "word" character). The pattern also permits a trailing '\n' in the string (the \n isn't "matched", but it's allowed by '$'). Since I understood the OP's criteria to not permit any sort of space or special character, the solution is to use a character class that excludes '_' in place of '\w', and '\z' instead of '$'.

      Update: As pointed out by Roy Johnson, \w also matches numeric digits, so the use of \d is redundant.


      Dave
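Both points can be checked with a short script. This is a sketch only: `$loose` is the pattern as posted in this sub-thread, `$strict` is the POSIX-class pattern from the first reply, and the variable names and test strings are made up for illustration.

```perl
use strict;
use warnings;

my $loose  = qr/^[\w\d][\w\d-]*$/;                                  # \w and $ as posted
my $strict = qr/\A[[:alpha:][:digit:]][[:alpha:][:digit:]-]*\z/;    # no '_', no trailing \n

for my $s ("abc_1", "abc\n", "abc-1") {
    (my $shown = $s) =~ s/\n/\\n/g;    # make the newline visible when printing
    printf "%-8s loose=%d strict=%d\n",
        $shown,
        ($s =~ $loose  ? 1 : 0),
        ($s =~ $strict ? 1 : 0);
}
```

The underscore and the trailing-newline strings should pass only the loose pattern, while "abc-1" should pass both.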

        Dave's exactly right! So, I'll recommend and explain something which combines our methods ...

        /\A[[:alpha:]\d][[:alpha:]\d-]*\z/

        I favor one-liners, but that's just a style preference. I changed my \w to [:alpha:], my ^ to \A, and my $ to \z per davido's recommendation, but I stuck with the \d because it's shorter and still works in this case. I left off the m at the beginning because it's optional when the delimiters are slashes. I left off the x at the end because this regex contains no whitespace or comments for /x to ignore (ref: perlre).

        • \A : Beginning of the string
        • [[:alpha:]\d] : One alphabetic character or digit (ref: perlre)
        • [[:alpha:]\d-]* : 0 or more characters consisting of alphabetic characters, digits, or dashes
        • \z : The end of the string
        --
        -- GhodMode
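To tie the thread together, here is a minimal sketch of a validator for the OP's three rules. The helper name `is_valid_name` and the sample inputs are hypothetical, and the sketch assumes the empty string should be rejected, which the OP did not specify.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns 1 if the string
# contains only letters, digits, or '-', and does not start with '-'.
# Assumption: an empty or undefined string is treated as invalid.
sub is_valid_name {
    my ($s) = @_;
    return 0 unless defined $s;
    return $s =~ /\A[[:alpha:][:digit:]][[:alpha:][:digit:]-]*\z/ ? 1 : 0;
}

for my $s ('abc', '9lives', 'a-b-c', '-abc', 'a_b', 'a b', "abc\n", '') {
    (my $shown = $s) =~ s/\n/\\n/g;    # make the newline visible when printing
    printf "%-8s %s\n", "[$shown]", is_valid_name($s) ? 'ok' : 'rejected';
}
```

If only ASCII letters and digits should pass, adding the /a modifier (Perl 5.14 and later) is one way to restrict the POSIX classes to the ASCII range.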
        
      Thanks a lot.