kprasanna_79 has asked for the wisdom of the Perl Monks concerning the following question:

Greetings Monks,
Is there any way to convert non-ISO characters to ASCII characters? Please give me some ideas.
--Prasanna.K

Replies are listed 'Best First'.
Re: non-iso character to ascii character
by borisz (Canon) on Jul 08, 2005 at 11:36 UTC
    You can use the Encode module to convert characters between character sets. It can also replace characters that exist in one character set but not in another, using Encode::encode or Encode::from_to. Another idea is to remove the unknown characters with
    s/[\x80-\xff]//g;
    # or
    y/\x80-\xff//d;
    # or (with CHECK 0 this substitutes '?' rather than deleting)
    Encode::from_to($string, 'utf-8', 'us-ascii', 0);
    or you may translate the characters yourself, e.g.:
    s/ö/oe/g;
    Boris
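    A minimal sketch of the options above, assuming the input arrives as UTF-8 encoded bytes (the sample string and variable names are made up for illustration). It decodes first so the substitutions work on characters rather than on individual bytes:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Assumption: the input is a UTF-8 encoded byte string.
my $bytes = "K\xc3\xb6ln caf\xc3\xa9";     # "Köln café" as UTF-8 bytes
my $text  = decode('UTF-8', $bytes);        # bytes -> Perl characters

# Hand-written replacement first, as in s/ö/oe/g
(my $mapped = $text) =~ s/\x{f6}/oe/g;      # ö -> oe

# Then delete whatever is still outside ASCII ...
(my $stripped = $mapped) =~ s/[^\x00-\x7f]//g;

# ... or let Encode substitute '?' for characters ASCII cannot represent
my $subst = encode('ascii', $mapped);

print "$stripped\n";   # Koeln caf
print "$subst\n";      # Koeln caf?
```

    Running the hand-written map before the catch-all step keeps the characters you care about and only loses the rest.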
Re: non-iso character to ascii character
by anonymized user 468275 (Curate) on Jul 08, 2005 at 11:48 UTC
    The set of "non-iso" character sets is theoretically infinite, therefore as literally posed, the requirement isn't computable.

    However, Convert modules do exist for converting specific non-iso character sets into ASCII, for example:

    Convert::IBM390 for EBCDIC to ASCII

    Convert::Cyr switches the code page between any pair of the following character sets:

    - koi8-r
    - windows-1251
    - iso8859-5
    - x-cp866
    - cp866
    - x-mac-cyrillic
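
    For comparison, the same kind of code page switch can be sketched with the core Encode module instead of Convert::Cyr. This is a swapped-in technique, not the Convert::Cyr interface; 'koi8-r' and 'cp1251' are Encode's names for two of the character sets listed, and the sample bytes are made up for illustration:

```perl
use strict;
use warnings;
use Encode qw(from_to);

# Assumption: $data holds KOI8-R encoded bytes ("Привет").
my $data = "\xf0\xd2\xc9\xd7\xc5\xd4";

# from_to converts the byte string in place and returns undef on failure.
my $length = from_to($data, 'koi8-r', 'cp1251');
die "conversion failed\n" unless defined $length;

# Show the resulting windows-1251 bytes as hex.
print join(' ', map { sprintf '%02X', ord } split //, $data), "\n";
# CF F0 E8 E2 E5 F2
```

    Note that none of these Cyrillic code pages fit into ASCII, so a conversion like this only moves text between 8-bit encodings.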

    One world, one people

Re: non-iso character to ascii character
by jhourcle (Prior) on Jul 08, 2005 at 11:31 UTC

    You'll need to be more specific with what you're trying to do.

    Are you trying to encode the character so that it can be transported in ASCII, and then recovered? If so, you might take a look at techniques that HTTP (URI encoding) and HTML (Entity encoding) use, or the many versions of mail encodings (quoted printable, etc.)
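
    A rough sketch of those three transport encodings, using only core modules. The percent-encoding and entity steps are hand-rolled with sprintf here for illustration, and the sample string is an assumption:

```perl
use strict;
use warnings;
use Encode qw(encode decode);
use MIME::QuotedPrint qw(encode_qp decode_qp);

my $text = "caf\x{e9}";                  # "café" as Perl characters
my $utf8 = encode('UTF-8', $text);       # the same text as UTF-8 bytes

# URI-style percent-encoding: escape every byte outside the unreserved set
(my $uri = $utf8) =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord $1)/ge;

# HTML numeric character references for everything outside ASCII
(my $html = $text) =~ s/([^\x00-\x7f])/sprintf('&#%d;', ord $1)/ge;

# Quoted-printable, as used in mail; '' asks for no soft line breaks
my $qp = encode_qp($utf8, '');

print "$uri\n";    # caf%C3%A9
print "$html\n";   # caf&#233;
print "$qp\n";

# All three are reversible; for example, undoing the percent-encoding:
(my $back = $uri) =~ s/%([0-9A-Fa-f]{2})/chr hex $1/ge;
print "round trip ok\n" if decode('UTF-8', $back) eq $text;
```

    Unlike stripping or approximating, these encodings lose nothing: the receiver can recover the original character.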

    Are you trying to approximate the character, i.e. 'é' -> 'e'? (This one's a bit trickier. I'm guessing someone has a map out there to give you a starting point, I just don't know where, but there's probably a module on CPAN to do it.) Of course, this only really works well for some languages (what do you do with Chinese characters?).
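
    One possible starting point for the approximation, sketched with the core Unicode::Normalize module: decompose each character, then drop the combining marks. The helper name approximate_ascii is made up for this example, and the input is assumed to be already decoded into Perl characters:

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFD);

# Hypothetical helper: approximate accented Latin letters with plain ASCII.
sub approximate_ascii {
    my ($text) = @_;
    my $decomposed = NFD($text);          # é becomes e + U+0301 (combining acute)
    $decomposed =~ s/\p{Mn}//g;           # drop the combining marks
    $decomposed =~ s/[^\x00-\x7f]/?/g;    # anything still non-ASCII becomes '?'
    return $decomposed;
}

# "café naïve" followed by one Chinese character
print approximate_ascii("caf\x{e9} na\x{ef}ve \x{4e2d}"), "\n";   # cafe naive ?
```

    This only helps for letters that decompose into a base letter plus marks. Characters such as 'ø' or 'ß' have no such decomposition and still need a hand-written map, and Chinese characters end up as '?', which is the limitation raised above.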