Maybe "Help needed understanding unicode in perl" would help with your Perl 5.10 code. It's too bad you can't move up to at least Perl 5.12 for better Unicode handling. See the OSCON Perl Unicode Slides and check out the slideshow for Unicode::Tussle. Although they require Perl 5.12, the scripts may yield some clues on how to handle things.
# yikes!!!
use v5.12; # minimal for unicode_strings feature
use v5.14; # optimal for unicode_strings feature
Maybe you could install Perl 5.14 in a home directory and use it to process the files?
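
If you really are stuck on 5.10, here is a minimal sketch of one workaround, assuming your problem is the usual one where case-changing functions ignore Unicode rules for characters in the 128-255 range when the string isn't stored internally as UTF-8. The helper name unicode_uc is made up for illustration; utf8::upgrade is built in and needs no "use utf8". I can't tell from here whether this matches your actual data, so treat it as a starting point only.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the slides): upper-case a string using
# Unicode rules on Perl 5.10, where "use feature 'unicode_strings'"
# is not available.
sub unicode_uc {
    my ($s) = @_;          # work on a copy, leave the caller's string alone
    utf8::upgrade($s);     # change the internal storage to UTF-8; the
                           # characters themselves stay the same
    return uc $s;          # uc now applies Unicode case mapping
}

my $native = "caf\xE9";    # e-acute held as a single native byte
my $upper  = unicode_uc($native);

# Encode on output so the terminal or file receives UTF-8 bytes.
binmode STDOUT, ':encoding(UTF-8)';
print $upper, "\n";
```

On 5.12 and later, "use feature 'unicode_strings'" (which the "use v5.12" line turns on) is meant to make the upgrade step unnecessary for case changes, which is one more reason to try a newer perl in your home directory.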