in reply to testing if a string is ascii

Hi, Assuming your string is in $str, try this:

if ($str =~ /[^!-~\s]/g){print "Non-ASCII character found"}

This matches any character that is neither a printable ASCII character (! through ~) nor whitespace.

Re^2: testing if a string is ascii
by Skeeve (Parson) on Sep 04, 2006 at 06:29 UTC
    There are more characters in ASCII below space than just \t, \r and \n.
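
To illustrate the point: ASCII also covers the control characters 0x00 through 0x1F and DEL (0x7F). A minimal sketch of a check against the whole 7-bit range might look like the following. The sub name is_ascii is a hypothetical helper chosen for this example, not something from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: true if every character has a code point in 0..127.
# The negated class matches any character outside that range, so the
# string is ASCII exactly when the match fails.
sub is_ascii {
    my ($str) = @_;
    return $str !~ /[^\x00-\x7F]/;
}

print is_ascii("plain text\n") ? "ASCII\n" : "non-ASCII\n";   # ASCII
print is_ascii("bell: \x07")   ? "ASCII\n" : "non-ASCII\n";   # ASCII (control character)
print is_ascii("caf\xE9")      ? "ASCII\n" : "non-ASCII\n";   # non-ASCII (0xE9)
```

Whether control characters should count as acceptable depends on what "is ascii" is meant to cover; this sketch takes the broadest reading.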

    s$$([},&%#}/&/]+}%&{})*;#$&&s&&$^X.($'^"%]=\&(|?*{%
    +.+=%;.#_}\&"^"-+%*).}%:##%}={~=~:.")&e&&s""`$''`"e
Re^2: testing if a string is ascii
by earlati2 (Beadle) on Sep 04, 2006 at 07:20 UTC
    Hi, Assuming your string is in $str, try this:

    if ($str =~ /[^!-~\s]/g){print "Non-ASCII character found"}

    This matches any character that is neither a printable ASCII character (! through ~) nor whitespace.

    Hi,
    Can you explain in more detail what this statement means?
    I don't know what [^!-~\s] does.
    Regards, Enzo

      Hi Enzo, Here is what the regex means:

      ^ at the start of a character class negates it, so the class matches any character that is not listed.

      !-~ is a range that matches every character from ! to ~. These are the first and last printable, non-space characters in the ASCII table (codes 33 and 126; Alt+033 for ! and Alt+126 for ~ in Windows). Because this range does not include whitespace, \s is added separately. \s is a shorthand character class that matches any whitespace character: space, tab (\t), newline, carriage return and form feed.

      The meaning of the complete statement is: "If the string contains any character that is neither in the ASCII range ! to ~ nor whitespace, the test is true."
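
As a rough illustration of that explanation, the sketch below runs the same character class over a few sample strings. The variable name $outside and the sample strings are made up for this example.

```perl
use strict;
use warnings;

# The character class from the post: anything that is neither in the
# range '!'..'~' (codes 33..126) nor whitespace.
my $outside = qr/[^!-~\s]/;

for my $str ("Hello, world!\n", "tab\tseparated", "na\xEFve", "bell\x07") {
    # Show characters outside '!'..'~' and space as hex escapes so the
    # output stays readable.
    (my $shown = $str) =~ s/([^!-~ ])/sprintf '\\x%02X', ord $1/ge;
    printf "%-20s %s\n", $shown, $str =~ $outside ? "match" : "no match";
}
```

The first two strings contain only printable characters and whitespace, so the test is false for them. The last two match: 0xEF lies outside ASCII, and 0x07 (bell) is an ASCII control character that is neither in the range nor whitespace.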

      Sriram