anlamarama has asked for the wisdom of the Perl Monks concerning the following question:
Hi,
I have a lot of text files that I am inserting into a database. The files do not share one encoding: some are UTF-8, some are iso-8859-9, and others are cp1254. I load each file into a variable (currently the only option) and update a row in the database with it. If the data is UTF-8, it should be inserted without changing the encoding. If it is cp1254 or iso-8859-9, I need to decode it first. However, I have no way of knowing which encoding a given file uses. Is there any way to determine the encoding? I will be updating 150,000 rows, so I would like to reduce potential errors as much as I can.
I tried Encode::Guess:
my $decoder = guess_encoding($data, qw/iso-8859-9 cp1254/);
It reports "iso-8859-9 or cp1254", but the correct encoding is cp1254, so it does not help either.
What do you suggest? Is there a workaround or solution for this?
Thanks in advance,
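One possible approach to the question above, offered as a rough sketch rather than a tested solution for the real data: try a strict UTF-8 decode first, and fall back to cp1254 only when that fails. The fallback relies on cp1254 and iso-8859-9 assigning the same characters to bytes 0xA0-0xFF; they differ only in 0x80-0x9F, where cp1254 has printable characters and iso-8859-9 has control codes. That is why a guesser cannot tell them apart on most text, and why decoding both as cp1254 should normally be harmless. The helper name decode_turkish_bytes is made up for this example and does not come from the thread.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: takes raw file bytes, returns (encoding_name, text).
sub decode_turkish_bytes {
    my ($bytes) = @_;

    # Strict UTF-8 attempt. FB_CROAK makes decode() die on malformed input.
    # Work on a copy, because decode() may modify its argument when a
    # CHECK value is given.
    my $copy = $bytes;
    my $text = eval { decode('UTF-8', $copy, Encode::FB_CROAK) };
    return ('UTF-8', $text) if defined $text;

    # Not valid UTF-8: treat as cp1254. Assumption: iso-8859-9 input
    # contains no C1 control bytes (0x80-0x9F), so cp1254 decodes it
    # to the same characters.
    return ('cp1254', decode('cp1254', $bytes));
}

# Example: the same word as UTF-8 bytes and as single-byte Turkish bytes.
my ($enc_a, $text_a) = decode_turkish_bytes("g\xC3\xBCne\xC5\x9F");
my ($enc_b, $text_b) = decode_turkish_bytes("g\xFCne\xFE");
print "$enc_a / $enc_b, same text: ", ($text_a eq $text_b ? "yes" : "no"), "\n";
```

A caveat: a short single-byte string can happen to be valid UTF-8 as well, so this check can misfire on very small inputs. It is worth logging which branch each file took and spot-checking a sample before updating all the rows.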
Replies are listed 'Best First'.
Re: Encoding Problem
by graff (Chancellor) on Nov 13, 2009 at 02:59 UTC
by anlamarama (Acolyte) on Nov 13, 2009 at 03:57 UTC
by graff (Chancellor) on Nov 14, 2009 at 18:08 UTC
Re: Encoding Problem
by anlamarama (Acolyte) on Nov 13, 2009 at 05:58 UTC