in reply to Characters in disguise

Sounds tough. You might try a genetic programming approach: breed small programs that chain string transformations until one reproduces the original STRING to NORMALIZED_STRING mapping. That's much easier said than done, though, and doing it in Perl would be especially tough. A code-is-data language like Scheme is a more natural fit, since the programs being bred can be manipulated as ordinary lists.
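To make the idea concrete, here is a rough sketch in Python rather than Perl or Scheme. It assumes a "program" is just a list of named string transformations. The primitive set, the example pairs, and every function name below are hypothetical, made up for illustration; a real attempt would need primitives suited to the actual disguises in the data.

```python
import random
import unicodedata

# Hypothetical primitives; a real set would depend on the data.
PRIMITIVES = {
    "lower": str.lower,
    "strip": str.strip,
    "strip_accents": lambda s: "".join(
        c for c in unicodedata.normalize("NFKD", s)
        if not unicodedata.combining(c)
    ),
    "collapse_spaces": lambda s: " ".join(s.split()),
    "drop_punct": lambda s: "".join(
        c for c in s if c.isalnum() or c.isspace()
    ),
}

# Hypothetical (STRING, NORMALIZED_STRING) training pairs.
EXAMPLES = [
    ("  Café ", "cafe"),
    ("Hello,  World!", "hello world"),
    ("NAÏVE", "naive"),
]

def run_program(program, text):
    """Apply each named primitive in order."""
    for name in program:
        text = PRIMITIVES[name](text)
    return text

def fitness(program, examples):
    """Count the example pairs the program reproduces exactly."""
    return sum(run_program(program, src) == dst for src, dst in examples)

def mutate(program, rng):
    """Insert, delete, or replace one step at random."""
    program = list(program)
    names = list(PRIMITIVES)
    choice = rng.choice(["insert", "delete", "replace"])
    if choice == "insert" or not program:
        program.insert(rng.randint(0, len(program)), rng.choice(names))
    elif choice == "delete":
        del program[rng.randrange(len(program))]
    else:
        program[rng.randrange(len(program))] = rng.choice(names)
    return program

def crossover(a, b, rng):
    """Join a prefix of one parent to a suffix of the other."""
    return a[: rng.randint(0, len(a))] + b[rng.randint(0, len(b)):]

def evolve(examples, generations=50, size=30, seed=0):
    """Breed programs until one fits every example or time runs out."""
    rng = random.Random(seed)
    names = list(PRIMITIVES)
    population = [
        [rng.choice(names) for _ in range(rng.randint(1, 3))]
        for _ in range(size)
    ]
    for _ in range(generations):
        # Rank by fitness, breaking ties in favour of shorter programs.
        population.sort(key=lambda p: (-fitness(p, examples), len(p)))
        if fitness(population[0], examples) == len(examples):
            break
        parents = population[: size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(size - len(parents))
        ]
        population = parents + children
    return max(population, key=lambda p: fitness(p, examples))

if __name__ == "__main__":
    best = evolve(EXAMPLES)
    print(best, fitness(best, EXAMPLES), "of", len(EXAMPLES))
```

Nothing guarantees the search finds a perfect program, and exact-match fitness gives it very little gradient to climb. Scoring partial matches, for instance by edit distance to the target, would probably help.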

You might also look at how String::Approx works. It does approximate (fuzzy) string matching, which is a similar problem, although I don't think you'll be able to use it directly.
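As I understand it, String::Approx measures approximateness by Levenshtein edit distance. The sketch below, again in Python, shows that measure. It is not the module's implementation or its API, and the `closest` helper and the sample words are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: fewest single-character insertions,
    deletions, and substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, or free match
            ))
        previous = current
    return previous[-1]

def closest(word, candidates):
    """Hypothetical helper: the candidate nearest to word."""
    return min(candidates, key=lambda c: edit_distance(word, c))

if __name__ == "__main__":
    print(closest("c@fe", ["cafe", "care"]))
```

This only helps when you already have a list of normalized candidates to compare against, which may be why it can't be applied directly here.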

-sam