in reply to Re: A tidier regex ?

Since this is checking user input typed at a prompt (rather than input read from a file), the suggestions you have made would be sufficient. However, I am not sure whether your suggestion covers it:
m[CDC\w+(?:,\s+\w+){2}]
How is the DDC.... catered for? My regex skills are at the beginner level, hence my question. Thanks.

Re^3: A tidier regex ?
by BrowserUk (Patriarch) on Sep 14, 2010 at 13:50 UTC
    How is the DDC.... catered for?

    It's allowed for by the \w+, which matches one or more characters from [A-Za-z0-9_], but it is not verified. So, for example, it would also match 'CDC_1, ABC, ABC', if there were any possibility of that appearing in your data.
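
    A minimal sketch of that behaviour. The first input string is a hypothetical example shaped the way the stricter regexes in this thread imply; the real data is not shown here, so treat it as an assumption.

```perl
use strict;
use warnings;

# Loose pattern from the question: only the 'CDC' prefix is checked literally.
my $loose = qr/CDC\w+(?:,\s+\w+){2}/;

# Hypothetical inputs (assumed, not taken from real data).
my @loose_inputs = (
    'CDC_A1_B2, DDCSMR01, DDCRMR02',   # shaped like the intended input
    'CDC_1, ABC, ABC',                 # not intended, but \w+ accepts it
);

for my $input (@loose_inputs) {
    my $verdict = $input =~ $loose ? 'matches' : 'no match';
    printf "%-32s %s\n", $input, $verdict;
}
```

    Both strings match, because \w+ accepts any run of word characters after each comma.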

    And that is where you will have to apply your knowledge of your data to decide just how specific you have to be to ensure you match only the data you want to match.

    You might, for instance, know that there will be lines similar to CDC_..., ABC..., XYZ... that you mustn't match, in which case you need to be more specific. Maybe m[CDC\w+(?:,\s+DDC\w+){2}] would suffice.
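
    As a rough illustration, requiring the literal DDC prefix rejects the ABC fields but still accepts anything made of word characters after DDC. The input strings are hypothetical.

```perl
use strict;
use warnings;

# Middle ground: the second and third fields must start with 'DDC'.
my $with_ddc = qr/CDC\w+(?:,\s+DDC\w+){2}/;

# Hypothetical inputs (assumed, not taken from real data).
my @ddc_inputs = (
    'CDC_A1_B2, DDCSMR01, DDCRMR02',   # shaped like the intended input
    'CDC_1, ABC, ABC',                 # now rejected: fields lack 'DDC'
    'CDC_1, DDC_x, DDC_y',             # still accepted: \w+ is not verified
);

for my $input (@ddc_inputs) {
    my $verdict = $input =~ $with_ddc ? 'matches' : 'no match';
    printf "%-32s %s\n", $input, $verdict;
}
```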

    But if the data is coming from users typing, who are apt to transpose and omit characters, then maybe you should stick with a fully specified regex. Say

    m[CDC(?:_[A-Z0-9]+){2}(?:,\s+DDC[SR]MR[A-Z0-9]+){2}]
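
    A sketch of how the fully specified regex might be used to check a typed line. The \A and \z anchors are my addition here, on the assumption that the whole (chomped) line should be the record; the helper name is_valid_entry and the inputs are hypothetical.

```perl
use strict;
use warnings;

# Fully specified pattern from above, unanchored as posted.
my $strict = qr/CDC(?:_[A-Z0-9]+){2}(?:,\s+DDC[SR]MR[A-Z0-9]+){2}/;

# Assumption: for prompt input, nothing may precede or follow the record.
my $strict_anchored = qr/\A$strict\z/;

# Hypothetical helper: true if the chomped line is a well-formed entry.
sub is_valid_entry {
    my ($line) = @_;
    chomp $line;    # \z does not match before a trailing newline
    return $line =~ $strict_anchored ? 1 : 0;
}

# Hypothetical inputs (assumed, not taken from real data).
my @typed = (
    "CDC_A1_B2, DDCSMR01, DDCRMR02\n",     # well formed
    "CDC_A1_B2, DDCSMR01\n",               # third field omitted
    "CDC_A1_B2, DDCMSR01, DDCRMR02\n",     # 'SMR' transposed to 'MSR'
    "CDC_A1_B2, DDCSMR01, DDCRMR02 oops\n" # trailing junk
);

for my $line (@typed) {
    (my $shown = $line) =~ s/\n\z//;
    printf "%-40s %s\n", $shown, is_valid_entry($line) ? 'ok' : 'rejected';
}
```

    Without the anchors, the last line would still match, since the posted regex only needs to find the record somewhere in the string.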

    Only you can know your full requirements.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.