Hi Monks,
I am building an application that loads data from an Excel file into an Oracle database. The idea is to save the file as tab-delimited text and simply parse it with Perl. All the data I work with is in UTF-8.
This works fine unless the data in the Excel file contains non-ASCII characters (in my case, Korean characters).
For example:
Korean Studies Information Service System/한국학술정보학술지원문데이터베이스
You can open an Excel file and paste this example if you want to recreate the problem (it is not placed in code tags because the characters get encoded there).
If I save this as a text file, the special characters all turn into '?'.
Another option is to save the file as Unicode text. The problem then is that the text file is encoded as UTF-16 rather than UTF-8, and I can't load it into the DB.
I tried to convert it to UTF-8 using Encode, but with no success for Korean characters (although with partial success for Czech characters, so I think I might be on the right track).
This is the code I used (test is the Unicode file, test_utf is the UTF-8 encoded file):
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode decode);
open IN, "<", "test" or die "Cannot open test: $!";
open OUT, ">", "test_utf" or die "Cannot open test_utf: $!";
while (my $line = <IN>) {
    # 'unicode' = UTF-16, I think
    $line = decode('unicode', $line);
    $line = encode('utf-8', $line);
    print OUT $line;
}
close IN;
close OUT;
Any idea what might work?
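One direction that may be worth trying, sketched here under assumptions rather than as a confirmed fix: the loop above reads raw bytes and splits them on the byte 0x0A before decoding. In UTF-16 a newline is two bytes, and some Korean characters also contain a 0x0A byte, so a line can end in the middle of a character. Only the first line carries the byte order mark as well. The sketch below decodes through an I/O layer instead, so lines are split after decoding. It assumes the file Excel writes is UTF-16 with a byte order mark; the helper name utf16_file_to_utf8 is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: re-encode a UTF-16 text file (with BOM) as UTF-8.
sub utf16_file_to_utf8 {
    my ($src, $dst) = @_;

    # :raw comes first so no newline translation touches the two-byte
    # code units; encoding(UTF-16) uses the BOM to pick the byte order
    # and does not pass the BOM on as data.
    open my $in, '<:raw:encoding(UTF-16)', $src
        or die "Cannot open $src: $!";
    open my $out, '>:raw:encoding(UTF-8)', $dst
        or die "Cannot open $dst: $!";

    while (my $line = <$in>) {
        # $line now holds decoded characters, so it always ends on a
        # whole character and can be written straight back out.
        print {$out} $line;
    }

    close $in;
    close $out or die "Cannot close $dst: $!";
    return;
}

# Usage (file names as in the script above): perl convert.pl test test_utf
utf16_file_to_utf8(@ARGV) if @ARGV == 2;
```

Line endings are copied through unchanged, so a file saved on Windows keeps its CR LF pairs in the UTF-8 output.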
Also, this isn't strictly Perl, but if anybody has an idea how to save an Excel file as UTF-8 without losing special characters, I will be extremely grateful.
Thanks,
Guy
Man is the only animal that can remain on friendly terms with the victims he intends to eat until he eats them.
- Samuel Butler