in reply to Non Keyboard Characters to HTML entity

Hi vallavan_sathish, I must admit I'm not familiar with Word's data format. Are you looking at the characters after they have been extracted from the file?

How are they represented at the moment? As Unicode?

Are you familiar with the Encode module? It converts text between character encodings and lets you control how characters with no equivalent in the target encoding are handled.
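For instance, if the text is already a Perl character string, something along these lines should turn everything outside ASCII into numeric HTML entities. This is only a sketch: the sample string is made up, and I'm assuming numeric entities (&#233;) are acceptable rather than named ones (&eacute;).

```perl
use strict;
use warnings;
use Encode qw(encode FB_HTMLCREF);

# Made-up sample: curly quotes, an accented e and an en dash,
# the sort of characters Word tends to produce.
my $text = "\x{201C}caf\x{E9}\x{201D} \x{2013} ok";

# Encode to ASCII. With the FB_HTMLCREF fallback, any character that
# has no ASCII mapping is written as a decimal entity (&#NNNN;)
# instead of being replaced with a substitution character.
my $html = encode('ascii', $text, FB_HTMLCREF);

print "$html\n";
```

If the data is still raw bytes from the file, it would need to go through decode() with the right source encoding first, which is why I'm asking how it is represented.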

Update: to list the available encodings, use the following one-liner (adjust the quotes for Windows):

perl -e 'use Encode; @list = Encode->encodings(":all"); for $schema (@list) { print "$schema\n"; }'