Let's say I want to process some text whose encoding is uncertain, except that it is probably text and probably in a Western language (one byte per character). I want to do some text processing on it, such as extracting all the words. Before doing anything else, I want to use
Encode::from_to($line, $probable_encoding, 'iso-8859-1')
to put everything into iso-8859-1 in (probable) good form.
Is there anything I can use that will give me the "probable encoding" for a file / string / whatever?
I was led in this direction by the venerable Thundergnat's answer to my earlier question, "matching german characters output from system call", where he suggested I run Encode::from_to($latinresult, 'cp437', 'iso-8859-1'); before matching the output of a system call on my German WinXP box. But how did he know to use 'cp437'?
UPDATE: Thanks monks, Encode::Guess looks good. I'm going to go try it out.
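For anyone landing here later, this is a minimal sketch of how Encode::Guess might be wired in front of Encode::from_to. The helper name probable_encoding and the sample string are my own inventions for illustration; guess_encoding and from_to are the real calls. One caveat from the Encode::Guess docs: single-byte encodings such as the ISO-8859 family are guessed poorly unless only one of them is a suspect (besides ascii and UTF-8), so it cannot tell cp437 from iso-8859-1 by itself.

```perl
use strict;
use warnings;
use Encode ();
use Encode::Guess;    # exports guess_encoding

# Hypothetical helper: return the name of the probable encoding of $octets,
# or undef if Encode::Guess cannot settle on exactly one candidate.
sub probable_encoding {
    my ($octets, @suspects) = @_;
    my $enc = guess_encoding($octets, @suspects);
    # On success $enc is an encoding object; on failure it is an error string.
    return ref($enc) ? $enc->name : undef;
}

# Made-up sample: Latin-1 bytes that are neither pure ASCII nor valid UTF-8.
my $line = "Gr\x{FC}\x{DF}e aus K\x{F6}ln";

my $name = probable_encoding($line, 'iso-8859-1');
if (defined $name) {
    # from_to converts the octets in place.
    Encode::from_to($line, $name, 'iso-8859-1');
    print "guessed $name\n";
}
else {
    print "could not guess, leaving the line untouched\n";
}
```

If you pass several single-byte suspects at once (say 'iso-8859-1' and 'cp437'), every byte string decodes cleanly under each of them, so the guess comes back ambiguous and the helper returns undef. In that case you still need outside knowledge of where the data came from.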
In reply to What encoding am I (probably) using? by tphyahoo