Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I would like to be able to test a string to see if it contains non-ASCII characters.
At the moment I'm using a negated character class over the hex range, as below. Is there a better way of doing the test?
sub ascii_test {
    my $string = shift;
    if ($string =~ /[^\x00-\x7f]/) {
        return "[$string] contains non ascii characters";
    }
    else {
        return "[$string] is all ascii";
    }
}

Re: How do I determine if a string contains non ascii characters
by japhy (Canon) on Jul 13, 2001 at 15:46 UTC
      Thanks. I couldn't think of any other sensible way of doing it, but I wanted to make sure I wasn't missing anything obvious.
      It would have been nice if there were already a predefined character class for this, though. (Since \A and \a are already taken, I suppose \I and \i?)
        There is a POSIX-style class for this: [:ascii:]. It is NOT a character class in and of itself -- it must be used INSIDE a bracketed character class:
        if ($string =~ /[^[:ascii:]]/) {
            # a non-ASCII character was found!
        }


        japhy -- Perl and Regex Hacker
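
For anyone who wants to try the [:ascii:] form, here is a minimal sketch. The helper names has_non_ascii and non_ascii_offsets are made up for illustration and do not come from the thread; the second one is just one possible way to report where the offending characters sit.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns 1 if $string holds
# any character outside the ASCII range 0x00-0x7F, otherwise 0.
sub has_non_ascii {
    my ($string) = @_;
    return $string =~ /[^[:ascii:]]/ ? 1 : 0;
}

# Hypothetical helper: returns the zero-based offsets of every
# non-ASCII character in $string.
sub non_ascii_offsets {
    my ($string) = @_;
    my @offsets;
    while ($string =~ /[^[:ascii:]]/g) {
        # pos() is the offset just past the one-character match
        push @offsets, pos($string) - 1;
    }
    return @offsets;
}

print has_non_ascii("plain text") ? "non-ASCII\n" : "all ASCII\n";
print has_non_ascii("caf\xe9")    ? "non-ASCII\n" : "all ASCII\n";
print join(",", non_ascii_offsets("a\xe9b\xff")), "\n";
```

This should behave the same as the /[^\x00-\x7f]/ test in the question; the POSIX form just reads a little more clearly.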

      Agreed. Though things that have problems with "8-bit characters" often also have problems with nul bytes and some control characters, so you might consider /[^ -~\s]/, which also rejects nul, "\x7f", and all of the control characters except for the usual "\t\f\r\n". Just depends on what you need. (:

              - tye (but my friends call me "Tye")
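
A rough sketch of that stricter check, for comparison. The name has_unsafe_char is invented here, and the whitespace is spelled out as \t\n\r\f instead of \s; that is a deliberate substitution, on the assumption that you want exactly those four characters, since \s can match more than that on newer perls and on Unicode strings.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns 1 if $string holds
# anything other than printable ASCII (space through tilde) or one of
# the whitespace characters \t \n \r \f, otherwise 0.
sub has_unsafe_char {
    my ($string) = @_;
    return $string =~ /[^ -~\t\n\r\f]/ ? 1 : 0;
}

for my $sample ("hello\tworld\n", "nul\0byte", "del\x7f", "caf\xe9") {
    printf "%s\n", has_unsafe_char($sample) ? "rejected" : "ok";
}
```

Only the first sample passes; the nul byte, the DEL character, and the 8-bit character are all rejected.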