
Re: Arabic Encodding Problem

by moritz (Cardinal)
on Feb 09, 2014 at 12:54 UTC

in reply to Arabic Encodding Problem

There are two problems with your code. The first is that you decode UTF-8 at the script/input level (with use utf8; and open with :encoding(UTF-8)), but you don't encode at the output level. Adding

binmode STDOUT, ':encoding(UTF-8)';

should help.
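To make the input/output symmetry concrete, here is a minimal sketch rather than a fix for your actual script. It assumes the input really is UTF-8; the helper slurp_utf8 and the temporary sample file are made up for illustration.

```perl
use strict;
use warnings;
use utf8;                              # string literals in this source are UTF-8
use File::Temp qw(tempfile);

binmode STDOUT, ':encoding(UTF-8)';    # encode characters on the way out

# Hypothetical helper: read a whole file, decoding it from UTF-8.
sub slurp_utf8 {
    my ($path) = @_;
    open my $in, '<:encoding(UTF-8)', $path or die "Cannot open $path: $!";
    local $/;                          # slurp mode: read everything at once
    my $text = <$in>;
    close $in;
    return $text;
}

# Create a small sample file so the sketch is self-contained.
my ($fh, $path) = tempfile(UNLINK => 1);
binmode $fh, ':encoding(UTF-8)';
print {$fh} "مرحبا\n";
close $fh;

my $text = slurp_utf8($path);          # characters, not bytes
print length($text), " characters: $text";
```

Without the binmode line, Perl would warn about "Wide character in print" and the terminal would receive Perl's internal representation rather than deliberately encoded UTF-8.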

The second (potential) problem is that you open all files as UTF-8, but if some of them aren't actually UTF-8 encoded, you'll get mojibake.

Before you decode a file as UTF-8, you need to find out its character encoding. If you have no additional metadata that can help you find out the character encoding, you can look for clues inside the document, or use something like Encode::Guess to auto-detect the character encoding. (But beware that these methods are also error-prone.)
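A rough sketch of the guessing approach, under some loud assumptions: decode_guessed is a hypothetical helper, and cp1256 is only my guess at a likely legacy encoding for Arabic text. It relies on the default Encode::Guess suspects only, because adding a single-byte encoding such as cp1256 to the suspect list tends to make the guess ambiguous (almost any byte sequence is valid in it).

```perl
use strict;
use warnings;
use Encode qw(decode encode);
use Encode::Guess;    # default suspects: ascii, utf8, BOM-marked UTF-16/32

# Hypothetical helper: decode raw bytes, guessing first and falling back
# to an assumed legacy encoding when the guess fails.
sub decode_guessed {
    my ($bytes, $fallback) = @_;
    my $guess = guess_encoding($bytes);
    # On success we get an encoding object, on failure an error string.
    return ref $guess
        ? ($guess->decode($bytes), $guess->name)
        : (decode($fallback, $bytes), $fallback);
}

my $chars = "\x{645}\x{631}\x{62D}\x{628}\x{627}";   # Arabic sample text
for my $bytes (encode('UTF-8', $chars), encode('cp1256', $chars)) {
    my ($text, $used) = decode_guessed($bytes, 'cp1256');
    printf "%s: %s\n", $used, $text eq $chars ? 'round-trips' : 'differs';
}
```

For real files you would read the raw bytes with a :raw layer and pass them to the helper, instead of opening the file with :encoding(UTF-8) up front.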

Re^2: Arabic Encodding Problem
by malak (Initiate) on Feb 09, 2014 at 23:36 UTC

    Thank you for replying. In fact, I can open all the files in IE using UTF-8, but in the script itself there is nothing about the encoding. I have used Encode::Guess, but it still does not solve the problem. On the other hand, when I used UTF-8 at the output layer, all the files became unreadable. I appreciate any help.
