Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I know how to check if a string is only numbers, but how can I detect IF the string contains anything besides
[a-zA-Z0-9]
So in other words, if the string is: $string = "hello world"; then the check would be ok, but if it had: $string = "hello world!"; then it would not, because it had a ! in it. But I don't want to check just for ! or other characters, I want to make sure it ONLY has a-z, A-Z and 0 to 9 and that is it.

sort of like this:
my $string = "I am not a window washer.";
if ($string =~ ?what pattern would I search for?) {
    # Oops has invalid character
} else {
    # String is perfect only alphanumeric
}
That is what I cannot find. I have searched everywhere, for everything I can think of, like 'alphanumeric only' 'alphanumeric' 'regex alphanumeric' but I keep coming up empty handed.

Can you please point me in the right direction?

Thank you!!

Replies are listed 'Best First'.
Re: check if string contains anything other than alphanumeric
by moritz (Cardinal) on Aug 12, 2007 at 07:45 UTC
    You can negate the character class:

    if ($string =~ m/[^a-zA-Z0-9]/) {
        print "The string contains non-alphanumeric characters";
    }
Re: check if string contains anything other than alphanumeric
by syphilis (Archbishop) on Aug 12, 2007 at 08:10 UTC
    I want to make sure it ONLY has a-z, A-Z and 0 to 9

    Your description is at odds with the example you gave. The string "hello world" contains a space - which is not an alphanumeric character, and not part of the character class [a-zA-Z0-9]

    Cheers,
    Rob
      Yup, you're right, lol, sorry. I was typing very fast and did not even realize it. This did work, though:
      if ($value =~ m/[^a-zA-Z0-9]/) {
      So thank you very much!!!

      I do have a question though, why does that work, since it is not !~ and is =~?

      Oh well, it does work and that is all that matters!

      Thanks again.
        If a character class starts with ^, it is a negated character class. That is, it will match any character that is not in the class. So, /[^a-zA-Z0-9]/ will match any character that is not a letter or digit.
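        As a small illustration of the negated class, the same pattern with /g in list context returns every offending character, which can help when reporting why a string was rejected. This is only a sketch; the helper name non_alnum_chars is made up for the example and is not from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: return every character outside [a-zA-Z0-9].
sub non_alnum_chars {
    my ($string) = @_;
    # In list context, a /g match with no capture groups returns
    # the list of all substrings matched by the negated class.
    return $string =~ /[^a-zA-Z0-9]/g;
}

my @bad = non_alnum_chars("hello world!");
print "offending characters: ", join(", ", map { "'$_'" } @bad), "\n";
# prints: offending characters: ' ', '!'
```

        An empty list means the string contained nothing but ASCII letters and digits (or was empty).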

        See perlretut for a tutorial on regexps and perlre for the details.

        update: My 200th node :)

        I do have a question though, why does that work, since it is not !~ and is =~?
        It reads more or less "it's true if the string contains (=~) something other (^) than letters and numbers (a-zA-Z0-9)", so it works.

        You can of course use !~ as well, to say "it's true if the string does not consist (!~) of one or more letters and numbers ([a-zA-Z0-9]+) in its entirety" (hence the ^ and $, which mark the start and end of the string).

        print "bad string: $string\n" if $string !~ /^[a-zA-Z0-9]+$/;
        which I prefer to write (yes back to =~ and not !~):
        print "bad string: $string\n" unless $string =~ /^[a-zA-Z0-9]+$/;
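        One caveat with the anchored form, shown here as a sketch: without the /m modifier, $ also matches just before a trailing newline, so input that was not chomped still passes. Using \A and \z anchors to the true start and end of the string. The helper name is_alnum_strict is an assumption made for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: true only if the whole string is ASCII letters/digits.
sub is_alnum_strict {
    my ($string) = @_;
    # \A and \z match only at the very start and very end of the string.
    return $string =~ /\A[a-zA-Z0-9]+\z/ ? 1 : 0;
}

my $line = "hello\n";    # e.g. a line of input that was never chomped

# $ allows one trailing newline, so this check passes.
print "with \$:  ", ($line =~ /^[a-zA-Z0-9]+$/ ? "OK" : "BAD"), "\n";   # OK
# \z does not, so the stray newline is caught.
print "with \\z: ", (is_alnum_strict($line) ? "OK" : "BAD"), "\n";      # BAD
```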

        Open source softwares? Share and enjoy. Make profit from them if you can. Yet, share and enjoy!

Re: check if string contains anything other than alphanumeric
by naikonta (Curate) on Aug 12, 2007 at 08:36 UTC
    Yes, perlre is the right direction :-) You can use a character class to group characters according to some specification, and in this case you can make an arbitrary spec. Perl provides symbols for some predefined character classes, such as \w for letters, digits, and underscore ("_"), \d for digits only, and \s for all kinds of recognized whitespace.

    From your description I think you need the \w character class,

    $ perl -le 'print "hello world" =~ /^\w+$/ ? "OK" : "BAD"'
    BAD ---> contain spaces
    $ perl -le 'print "helloworld" =~ /^\w+$/ ? "OK" : "BAD"'
    OK
    $ perl -le 'print "hello_world" =~ /^\w+$/ ? "OK" : "BAD"'
    OK ---> underscore is considered alphanumeric character.
    If you want to exclude the underscore as well then you have to say the character class explicitly,
    $ perl -le 'print "hello_world" =~ /^[a-zA-Z0-9]+$/ ? "OK" : "BAD"'
    BAD ---> underscore is not in the class
    Since version 5.6.0, Perl supports the [:class:] style of character class, and the equivalent of the above is [:alnum:].
    $ perl -le 'print "hello world" =~ /^[[:alnum:]]+$/ ? "OK" : "BAD"'
    BAD ---> space is not alnum
    $ perl -le 'print "hello_world" =~ /^[[:alnum:]]+$/ ? "OK" : "BAD"'
    BAD ---> _ is not alnum
    $ perl -le 'print "HelloWorld1" =~ /^[[:alnum:]]+$/ ? "OK" : "BAD"'
    OK --> nothing there but letters and digits
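    One difference worth knowing, sketched below: [[:alnum:]] and the explicit [a-zA-Z0-9] range agree on ASCII input, but on strings that follow Unicode rules the POSIX class also accepts non-ASCII letters and digits. The /a modifier (Perl 5.14 and later) restricts it to ASCII. The Greek test string is just an illustrative choice.

```perl
use strict;
use warnings;

my $greek = "\x{3b1}\x{3b2}\x{3b3}";    # Greek letters alpha, beta, gamma

# Code points above 255 force Unicode rules, so [[:alnum:]] accepts them.
my $posix = $greek =~ /\A[[:alnum:]]+\z/  ? "OK" : "BAD";
# The /a modifier (Perl 5.14+) limits POSIX classes to ASCII characters.
my $ascii = $greek =~ /\A[[:alnum:]]+\z/a ? "OK" : "BAD";
# The explicit range never matches non-ASCII characters.
my $range = $greek =~ /\A[a-zA-Z0-9]+\z/  ? "OK" : "BAD";

print "[[:alnum:]]    : $posix\n";    # OK
print "[[:alnum:]] /a : $ascii\n";    # BAD
print "[a-zA-Z0-9]    : $range\n";    # BAD
```

    If only a-z, A-Z and 0-9 should pass, as in the original question, the explicit range or the /a form is the safer choice.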
    HTH,


      $ perl -le 'print "hello_world" =~ /^\w+$/ ? "OK" : "BAD"'
      OK ---> underscore is considered alphanumeric character.
      Underscore is not considered an alphanumeric character. It is considered a word character (hence the \w character class.)
        It's said that \w matches a 'word' character (alphanumeric plus '_'), so I should have said that "underscore is considered part of that character class". Thank you, jwkrahn.


Re: check if string contains anything other than alphanumeric
by jwkrahn (Abbot) on Aug 12, 2007 at 08:31 UTC
Re: check if string contains anything other than alphanumeric
by Your Mother (Archbishop) on Aug 13, 2007 at 06:19 UTC

    (Update: whoops! jwkrahn did mention it below.)

    I don't think anyone has mentioned the POSIX character class syntax: alnum. Those and more are in perlre. They don't work in older versions of Perl, IIRC.

    # matches only if all characters are alphanumeric
    /\A[[:alnum:]]+\z/
    # matches if any single character is not alphanumeric
    /[^[:alnum:]]/