in reply to RE: Eight bit character (non-ASCII) conversion
in thread Eight bit character (non-ASCII) conversion

I think I have a slightly different problem than what these modules address -- moreover, these modules require a local iconv implementation, which Win2k does not have (to my knowledge).

Even if I had an implementation, I would still need the conversion table installed, which is what I was grabbing and parsing from unicode.org -- given that I had already parsed the file, I might just as well build the tr/// strings at the same time.
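As a rough illustration of that approach, here is a minimal sketch of building a tr/// from two mapping tables. It assumes the tables use the usual unicode.org layout of "0xBYTE 0xUNICODE # name" per line; the inline excerpts, parse_table, and the variable names are hypothetical stand-ins, not the code I actually wrote.

```perl
use strict;
use warnings;

# Tiny excerpts standing in for the real Mac Roman and cp1252 tables
# (assumed layout: byte value, Unicode code point, comment).
my $mac_table = <<'END';
0x80 0x00C4 # LATIN CAPITAL LETTER A WITH DIAERESIS
0x8E 0x00E9 # LATIN SMALL LETTER E WITH ACUTE
0xA5 0x2022 # BULLET
END

my $win_table = <<'END';
0xC4 0x00C4 # LATIN CAPITAL LETTER A WITH DIAERESIS
0xE9 0x00E9 # LATIN SMALL LETTER E WITH ACUTE
0x95 0x2022 # BULLET
END

# Parse one table into a hash of byte value => Unicode code point.
sub parse_table {
    my ($text) = @_;
    my %map;
    for my $line (split /\n/, $text) {
        next unless $line =~ /^0x([0-9A-Fa-f]{2})\s+0x([0-9A-Fa-f]{4})/;
        $map{ hex $1 } = hex $2;
    }
    return \%map;
}

my $mac            = parse_table($mac_table);
my %win_by_unicode = reverse %{ parse_table($win_table) };   # code point => byte

# Build both sides of the tr/// as \xNN escapes, pairing bytes via Unicode.
my ($from, $to) = ('', '');
for my $mac_byte (sort { $a <=> $b } keys %$mac) {
    my $win_byte = $win_by_unicode{ $mac->{$mac_byte} };
    next unless defined $win_byte;    # no equivalent in the target table
    $from .= sprintf '\\x%02X', $mac_byte;
    $to   .= sprintf '\\x%02X', $win_byte;
}

# tr/// tables are fixed at compile time, so the operator is built via eval.
my $mac_to_win = eval "sub { (my \$s = shift) =~ tr/$from/$to/; \$s }"
    or die "could not build translator: $@";

my $converted = $mac_to_win->("caf\x8E");   # Mac bytes in, cp1252 bytes out
```

Bytes that have no counterpart in the target table are simply left alone by this sketch, so a real version would need to decide what to do with them.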

Finally, I couldn't find any documentation indicating which conversion tables (on a Solaris installation) corresponded to what -- nothing that looked like a Mac table or a Windows code page 1252 table, which I think is standard 8859-1 but I'm not positive. The conversion tables are in some binary format, too, so I can't just inspect them.

Do you have more info you could share?

RE: RE: RE: Eight bit character (non-ASCII) conversion
by lhoward (Vicar) on Nov 15, 2000 at 17:54 UTC
    I know that you can download libiconv, a free implementation of the iconv library. I don't know if it will build on Windows or not, but it does support a couple of character sets that sound like what you're looking for. Most versions of the iconv library come with an iconv command-line program that will do conversions. On mine, "iconv --list" gives me a list of all the character sets it supports.

    You may also want to check out some of the Unicode conversion modules:

    • Unicode::Map
    • Unicode::Map8
    • Unicode::MapUTF8
    I believe that all of them run without needing any external (non-CPAN supplied) libraries. Since those modules are designed to map to/from Unicode, you could solve your problem by doing a 2-step conversion: MAC->UNICODE->WINDOWS1252
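The 2-step idea can be sketched as follows. This is only an illustration: it swaps in the Encode module that ships with later Perls (5.8 and up) rather than the Unicode::Map modules listed above, and mac_to_cp1252 is a made-up helper name.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Convert a Mac Roman byte string to Windows-1252 by going through Unicode.
sub mac_to_cp1252 {
    my ($bytes) = @_;
    my $chars = decode('MacRoman', $bytes);   # step 1: Mac bytes -> Unicode characters
    return encode('cp1252', $chars);          # step 2: Unicode -> Windows-1252 bytes
}

# 0x8E is e-acute and 0xA5 is the bullet in Mac Roman.
my $converted = mac_to_cp1252("caf\x8E \xA5");
```

Going through Unicode means only one table per character set is needed, instead of one table for every pair of character sets. Characters that exist in the source set but not in the target set still have to be handled somehow; with default settings Encode substitutes a replacement character for them.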
      Unicode::Map8 provides an 8-bit to 8-bit method, and is extremely comprehensive in its coverage of character maps. You have to poke through things to figure out which code pages to use, but that's a problem I faced anyway.

      Had I known it existed I would likely have used it.

      Instead I learned some new things about Perl :)

      Update:
      It turns out that, for portability reasons, the Unicode::Map8 module is really the way to go.

      Thanks for the tip!