Re: UTF-8 Decoding, Wide Characters, and XML::Twig

I use Text::Unidecode for this, though I should add that I don't care about fixed width. It's a great module for readability, since it transliterates e.g. "é" to "e", which I prefer over replacing every high character with a "?". However, I believe it turns a single Japanese character into several ASCII characters, for instance "wa"¹, so the output can be longer than the input. That might screw up your alignment.
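
To see whether this affects your data, a minimal sketch like the following compares character counts before and after transliteration. It assumes Text::Unidecode is installed from CPAN and that the source file is saved as UTF-8; the sample strings are only illustrations.

```perl
use strict;
use warnings;
use utf8;                 # the string literals below are UTF-8 in the source
use Text::Unidecode;      # CPAN module, not part of core Perl

binmode STDOUT, ':encoding(UTF-8)';

# Sample inputs chosen for illustration only.
for my $in ("café", "わ") {
    my $out = unidecode($in);
    # length() counts characters, so a difference here means
    # column alignment will shift after transliteration.
    printf "%s (%d chars) -> %s (%d chars)\n",
        $in, length $in, $out, length $out;
}
```

If the two counts differ for your input, any fixed-width padding has to be done after the call to unidecode, not before.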

I know Unicode defines several different kinds of spaces (breaking and non-breaking, zero-width, half-width, em and en spaces, just to name some off the top of my head). It's entirely possible that the UTF-8 to ASCII translation misses one of these.
Depending on the input, you might try a simple substitution like s/\s/ /g before translating to ASCII, although I don't know exactly which Unicode whitespace characters \s matches. Note that it would also turn tabs and newlines into spaces.
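
A slightly narrower variant of that idea, as a sketch: match the Unicode "space separator" category (\p{Zs}) instead of \s, which leaves tabs and newlines alone. The helper name normalize_spaces is made up for this example. Note that the zero-width space (U+200B) is a format character rather than a space separator, so neither \s nor \p{Zs} catches it; the second substitution below removes it explicitly, which is my own assumption about what you'd want.

```perl
use strict;
use warnings;

# Hypothetical helper: map every Unicode space separator to an ASCII space.
sub normalize_spaces {
    my ($text) = @_;
    $text =~ s/\p{Zs}/ /g;     # NBSP, en/em space, ideographic space, ...
    $text =~ s/\x{200B}//g;    # zero-width space: drop it (assumption)
    return $text;
}

# NBSP, em space, ideographic space, zero-width space
my $in  = "a\x{00A0}b\x{2003}c\x{3000}d\x{200B}e";
my $out = normalize_spaces($in);

# Show the resulting code points so invisible characters are obvious.
print join(' ', map { sprintf 'U+%04X', ord } split //, $out), "\n";
```

This only works on decoded character strings, so decode the input first (for example with Encode::decode) if you read it as raw bytes.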
_______________________
¹ I'm not at all familiar with Japanese; I just mentioned it for illustrative purposes. The Text::Unidecode POD has more detailed (and more accurate!) examples.
