in reply to How do I keep anything other than alphanumeric out of a variable?

The following gets rid of non-alphanumerics and underscores:

$user_name =~ s/[\W_]//g;

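The pattern [\W_] breaks down as follows:

[     start of a character class
\W    any non-word character (anything \w does not match; conventionally, anything other than a letter, digit, or underscore)
_     a literal underscore, which \W on its own would leave in place
]     end of the character class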
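
As a quick check of the substitution, here is a minimal sketch. The helper name strip_non_alnum and the sample input are made up for illustration and are not part of the original answer:

```perl
use strict;
use warnings;

# Hypothetical helper: returns a copy of the string with every
# non-word character and every underscore deleted.
sub strip_non_alnum {
    my ($text) = @_;
    $text =~ s/[\W_]//g;
    return $text;
}

# Hypothetical sample input.
my $user_name = 'j_doe!! 42';
print strip_non_alnum($user_name), "\n";   # prints "jdoe42"
```

Working on a copy inside the helper leaves the caller's variable untouched, which makes it easier to compare the cleaned value with what the user actually typed.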

However, instead of just silently cleaning data, I'd prefer to check the string for undesirable characters and notify the user if it is bad, so that they can fix it:

$user_name =~ /[\W_]/ and warn "user name is bad\n";
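
The same check could be wrapped in a small function, sketched below. The name is_valid_user_name, the sample inputs, and the rule that an empty name is rejected are assumptions added for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 when the name is non-empty and contains
# no non-word characters and no underscores, 0 otherwise.
sub is_valid_user_name {
    my ($name) = @_;
    # Assumption: an undefined or empty name counts as invalid.
    return 0 unless defined $name && length $name;
    return $name =~ /[\W_]/ ? 0 : 1;
}

# Hypothetical sample inputs.
for my $candidate ('alice42', 'bad name!', 'under_score') {
    if (is_valid_user_name($candidate)) {
        print "'$candidate' is OK\n";
    }
    else {
        warn "user name '$candidate' is bad\n";
    }
}
```

Rejecting the input rather than rewriting it means the user never ends up with a name that differs from the one they entered.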

Re: Answer: How do I keep anything other than Alpha/Numeric data out of a variable?
by davido (Cardinal) on Aug 26, 2003 at 19:01 UTC
    One caveat here: POSIX.

    POSIX locale support can, on some systems, alter what \W matches, so that it no longer means exactly "[^a-zA-Z0-9_]" as you would conventionally expect.

    According to Friedl (the Owl book, "Mastering Regular Expressions", 1st edition, pp. 65-66 and 257) (paraphrasing...):

    • POSIX can alter the meaning of \w and \W to include what other languages consider to be word characters.
    • "Locales can influence many tools that do not aspire to POSIX compliance, sometimes without their knowledge! ... If the non-POSIX utility is compiled on a system with a POSIX-compliant C library, some support can be bestowed, although the exact amount can be hit or miss. For example, the tool's author might have used the C library functions for capitalization issues, but not for \w support."
    • It is sometimes necessary to use [a-zA-Z0-9_] rather than \w. According to Friedl: "...a friend ran into a problem in which his version of Perl treated certain non ASCII bytes as [accented characters]..."

    Therefore, it is in some cases advisable to use the following construction to accomplish the task described in the subject line of this thread:

    $user_name =~ s/[^a-zA-Z0-9]//g;

    Or with case insensitivity:

    $user_name =~ s/[^a-z0-9]//gi;

    Of course this solution more accurately answers the question: "How do I purge anything other than Alpha/Numeric data from a variable?"
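
    The difference between \W and an explicit range can be sketched as follows. Locale behavior varies by system, so this example does not use a locale at all; it swaps in Perl's /u (Unicode rules) and /a (ASCII-only) regex modifiers, available from Perl 5.14, to show the same kind of surprise. The helper names and the sample string are hypothetical:

```perl
use v5.14;    # needed for the /u and /a regex modifiers
use strict;
use warnings;

# \w follows Unicode rules here, so an accented letter counts as a word character.
sub strip_with_unicode_w {
    my ($text) = @_;
    $text =~ s/[\W_]//gu;
    return $text;
}

# Explicit ASCII ranges: only a-z, A-Z and 0-9 survive.
sub strip_with_ascii_class {
    my ($text) = @_;
    $text =~ s/[^a-zA-Z0-9]//g;
    return $text;
}

# /a restricts \w (and therefore \W) to ASCII word characters.
sub strip_with_ascii_flag {
    my ($text) = @_;
    $text =~ s/[\W_]//ga;
    return $text;
}

# Hypothetical sample: "Jos", e-acute (U+00E9), underscore, "99".
my $name = "Jos\x{e9}_99";

printf "unicode \\w keeps %d characters\n", length strip_with_unicode_w($name);    # 6
printf "ascii class keeps %d characters\n", length strip_with_ascii_class($name);  # 5
printf "/a modifier keeps %d characters\n", length strip_with_ascii_flag($name);   # 5
```

    The explicit range and the /a modifier give the same result on this input; the range has the advantage of working on Perl versions older than 5.14.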

    Dave

    "If I had my life to do over again, I'd be a plumber." -- Albert Einstein