http://qs1969.pair.com?node_id=1074109


in reply to Arabic Encodding Problem

There are two problems with your code. The first is that you decode UTF-8 at the script/input level (with use utf8; and open with :encoding(UTF-8)), but you don't encode at the output level. A

binmode STDOUT, ':encoding(UTF-8)';

should help.
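To make the decode-on-input, encode-on-output pairing concrete, here is a minimal sketch. It is not your script: the in-memory "file" and the sample word are assumptions made only so the snippet runs on its own, and the string is written with escapes so it does not depend on how the script file itself is saved.

```perl
use strict;
use warnings;

# Encode on output: without this layer, printing the decoded line below
# would trigger a "Wide character in print" warning.
binmode STDOUT, ':encoding(UTF-8)';

# Five Arabic letters as code points (with "use utf8;" a literal would
# work just as well).
my $text = "\x{645}\x{631}\x{62D}\x{628}\x{627}";

# Stand-in for a UTF-8 encoded input file: an in-memory handle over raw
# bytes (hypothetical sample data).
my $raw = "\xD9\x85\xD8\xB1\xD8\xAD\xD8\xA8\xD8\xA7\n";
open my $in, '<:encoding(UTF-8)', \$raw
    or die "Cannot open in-memory file: $!";
my $line = <$in>;    # decoded: characters, not bytes
close $in;
chomp $line;

print "characters: ", length($line), "\n";
print "$line\n";     # encoded back to UTF-8 by the STDOUT layer
```

The point is that strings inside the program are characters; bytes only exist at the edges, where an I/O layer converts them.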

The second (potential) problem is that you open all files as UTF-8, but if some of them aren't actually UTF-8 encoded, you'll get mojibake.

Before you decode a file as UTF-8, you need to find out its character encoding. If you have no additional metadata that tells you the character encoding, you can look for clues inside the document, or use something like Encode::Guess to auto-detect it. (But beware that these methods are also error-prone.)
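A rough sketch of the Encode::Guess approach follows. The helper name decode_guessed and the sample bytes are made up for illustration. With no extra suspects, Encode::Guess only considers ASCII, UTF-8 and BOM-marked UTF-16/32, so treat this as a starting point rather than a reliable detector.

```perl
use strict;
use warnings;
use Encode::Guess;    # exports guess_encoding by default

# Hypothetical helper: decode raw bytes using whatever encoding
# Encode::Guess settles on among its default suspects plus any extra
# suspects passed in.
sub decode_guessed {
    my ($bytes, @suspects) = @_;
    my $enc = guess_encoding($bytes, @suspects);
    # On failure guess_encoding returns an error string, not an object.
    ref $enc or die "Cannot guess encoding: $enc\n";
    return ($enc->decode($bytes), $enc->name);
}

# Sample data (an assumption): five Arabic letters encoded as UTF-8.
my $raw = "\xD9\x85\xD8\xB1\xD8\xAD\xD8\xA8\xD8\xA7";
my ($text, $name) = decode_guessed($raw);
print "guessed $name, ", length($text), " characters\n";
```

If you pass several single-byte suspects (say cp1256 and iso-8859-6), it is common for more than one of them to decode the data without error, and then no unique guess comes back. That is one reason explicit metadata beats guessing whenever you can get it.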