in reply to Is utf8, ascii ?

From the core utf8 module, you can use:

utf8::valid($string)

But presumably you don't want to just discard data. Instead, you want to convert it to UTF-8 and insert it safely. If you know what character set it is in, use Encode to convert it. Otherwise, as you have done, you can use Encode::Guess to try to work out the character set first.
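A minimal sketch of both steps, using only the core Encode module. The helper names is_valid_utf8 and to_utf8 are made up for illustration, and ISO-8859-1 is only an assumed source character set. One caveat, as far as I read the utf8 documentation: utf8::valid reports whether Perl's internal representation is consistent and returns true for any string held as plain bytes, so a strict decode is probably the safer test for raw input.

```perl
use strict;
use warnings;
use Encode qw(decode encode FB_CROAK LEAVE_SRC);

# Hypothetical helper: true if $bytes is a well-formed UTF-8 byte sequence.
sub is_valid_utf8 {
    my ($bytes) = @_;
    # FB_CROAK makes decode() die on malformed input;
    # LEAVE_SRC stops decode() from modifying $bytes in place.
    return eval { decode('UTF-8', $bytes, FB_CROAK | LEAVE_SRC); 1 } ? 1 : 0;
}

# Hypothetical helper: convert bytes in a known character set to UTF-8 bytes.
sub to_utf8 {
    my ($bytes, $charset) = @_;
    my $chars = decode($charset, $bytes, FB_CROAK | LEAVE_SRC);
    return encode('UTF-8', $chars);
}

my $latin1 = "caf\xE9";                       # "café" as ISO-8859-1 bytes (assumed input)
my $utf8   = to_utf8($latin1, 'ISO-8859-1');  # same text as UTF-8 bytes

print "latin1 valid as UTF-8: ", is_valid_utf8($latin1), "\n";
print "converted valid as UTF-8: ", is_valid_utf8($utf8), "\n";
```

If the character set is not known, the $charset argument would come from whatever Encode::Guess reports, and a failed guess is the point at which discarding the data becomes the fallback.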

Clint

Re^2: Is utf8, ascii ?
by rootcho (Pilgrim) on Aug 07, 2007 at 19:38 UTC
    I see.
    I'm new to this encoding stuff, but now I understand: check, guess, try to encode, and if that fails, discard.
    At the moment I just want to discard; later, when I have time, I will do more tests.
    But my next question was: if I check for a valid UTF-8 string and discard everything else, will this discard the string if it is ASCII?
      No. U+0000 to U+007F (the first 128 Unicode characters) are each represented in UTF-8 by a single byte, the same byte that is used in ASCII. So ASCII (7-bit ASCII, not e.g. ISO-8859-* or Windows-1252) is a subset of UTF-8, and a validity check will not discard it.
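
The subset claim can be checked directly. This is only a sketch, assuming the core Encode module and its strict 'UTF-8' encoding; the variable names are illustrative.

```perl
use strict;
use warnings;
use Encode qw(decode encode FB_CROAK LEAVE_SRC);

# Every 7-bit ASCII byte (0x00 to 0x7F) in one string.
my $ascii = join '', map { chr } 0x00 .. 0x7F;

# Strict decoding succeeds, and each byte maps to the code point of the same value.
my $chars = decode('UTF-8', $ascii, FB_CROAK | LEAVE_SRC);
my $same  = $chars eq $ascii ? 1 : 0;

# Re-encoding gives back the identical bytes: one byte per character.
my $round_trip = encode('UTF-8', $chars) eq $ascii ? 1 : 0;

# A lone byte above 0x7F (here 0xE9, "é" in ISO-8859-1) is not valid UTF-8.
my $high    = "\xE9";
my $high_ok = eval { decode('UTF-8', $high, FB_CROAK | LEAVE_SRC); 1 } ? 1 : 0;

print "ASCII decodes unchanged: $same\n";
print "ASCII round-trips: $round_trip\n";
print "lone 0xE9 accepted: $high_ok\n";
```

Under those assumptions the first two checks should report 1 and the last 0, which is why a UTF-8 validity filter keeps pure ASCII but rejects ISO-8859-* or Windows-1252 text that contains high bytes.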