in reply to UTF-8 Decoding, Wide Characters, and XML::Twig

If not, how do you convert from UTF-8 to ASCII?

The last time I had to do this I was feeding XML data to a Glimpse search indexer. Glimpse doesn't support UTF-8 (or at least, it didn't then), but I needed it to answer queries in UTF-8, so just dropping the non-ASCII characters or replacing them with ? wasn't an option. To solve the problem I converted the UTF-8 to UTF-7, a Unicode encoding that uses only 7-bit ASCII characters. Then I did the same thing with the search strings before sending them to Glimpse for matching. It worked great for me; perhaps you can do the same thing.
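
A minimal sketch of that round trip, written in Python purely for illustration (the post doesn't show the code that was actually used). The helper names `to_index_form` and `from_index_form` are hypothetical, and nothing here talks to Glimpse; it only shows that UTF-7 output is plain 7-bit ASCII and decodes back to the original text.

```python
def to_index_form(text: str) -> str:
    """Encode Unicode text as UTF-7 so it contains only 7-bit ASCII characters."""
    # Python ships a built-in "utf-7" codec; the result is pure ASCII bytes.
    return text.encode("utf-7").decode("ascii")


def from_index_form(ascii_text: str) -> str:
    """Reverse the conversion: turn UTF-7 ASCII text back into Unicode."""
    return ascii_text.encode("ascii").decode("utf-7")


if __name__ == "__main__":
    # Hypothetical sample data standing in for the XML text and a search string.
    document = "Müller ordered a café crème"
    query = "café"

    indexed = to_index_form(document)      # what would be fed to the indexer
    encoded_query = to_index_form(query)   # same conversion applied to the query

    print(indexed)
    print(encoded_query)

    # Both strings are safe for an ASCII-only tool, and nothing is lost.
    assert indexed.isascii() and encoded_query.isascii()
    assert from_index_form(indexed) == document
```

The point of using the same conversion on both sides is that the indexer only ever sees ASCII. How well an encoded query lines up with the encoded documents can depend on the encoder and on how the indexer tokenizes, so it's worth checking against real data.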

-sam
