in reply to reading Unicode from Excel

Actually it's quite easy to get UTF-8 through Win32::OLE out of Excel. All you have to do is tell Excel (or OLE, I'm not sure) that you want UTF-8. Let me pull out some old project code...
use Win32::OLE 'CP_UTF8';
Win32::OLE->Option(CP => CP_UTF8);
With that, all data you get out of Excel should be in UTF-8.

Re: Re: reading Unicode from Excel
by nagilum (Initiate) on Oct 08, 2002 at 10:54 UTC
    Thanks! That was exactly it!
    Where can I find stuff like that? It's not in the docs I've read, that's for sure :-{
      Hi, I tried this code, but I still get ?????. Can you help? karthick@dgbmicro.com

      use OLE;
      use Win32::OLE 'CP_UTF8';
      use utf8;
      Win32::OLE->Option(CP => CP_UTF8);
      $xlfile = "c:\\Akruti Tamil Unicode Test Vectors.xls";

      ##### OLE - Excel Connection
      # Create OLE object - Excel Application Pointer
      $xl_app = CreateObject OLE 'Excel.Application' || die $!;
      # Set Application Visibility
      # 0 = Not Visible
      # 1 = Visible
      $xl_app->{'Visible'} = 0;
      # Open Excel File
      $workbook = $xl_app->Workbooks->Open($xlfile);
      # setup active worksheet
      $worksheet = $workbook->Worksheets(1);
      #///////////////////////////////////////////opened xls file for reading.
      # retrieve value from worksheet
      print $worksheet->Range("B2")->{'Value'};
        Hi,

        It's "easier" to parse Unicode Excel files by saving them as text and decoding that text, using Perl 5.8.x or higher.

        open(FH, "<:raw:encoding(UTF-16LE)", $File_UTF_MS);

        With these lines you tell Excel to save the file as UTF-16LE text.
        $Excel->{DisplayAlerts} = 0;
        $xlWorkBook->SaveAs( $File_UTF_MS, $xlConst->{'xlUnicodeText'});

        With this line you open the output file for writing in UTF-8.
        open(FH, ">:utf8", $File_UTF8);

        This is necessary because our dear MS friends use UTF-16.
        Cf. http://blogs.msdn.com/brettsh/archive/2006/06/07/620986.aspx
        Therefore you need to convert from raw bytes to UTF-16 and then to UTF-8.
        Kind regards.
        ddn123456
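
The raw-to-UTF-16-to-UTF-8 conversion described in the last reply can be sketched roughly as follows. This is only an illustration under some assumptions: the helper name utf16le_to_utf8 and the file names are made up, a throwaway file stands in for Excel's xlUnicodeText output, and the input is assumed to be UTF-16LE that may start with a byte order mark.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Hypothetical helper (not from the thread): re-encode a UTF-16LE text file,
# such as one saved by Excel as xlUnicodeText, into UTF-8.
sub utf16le_to_utf8 {
    my ($src, $dst) = @_;

    # :raw removes the default layers (e.g. CRLF translation on Windows)
    # before the decoding layer is pushed.
    open(my $in, '<:raw:encoding(UTF-16LE)', $src)
        or die "Cannot read $src: $!";
    open(my $out, '>:raw:encoding(UTF-8)', $dst)
        or die "Cannot write $dst: $!";

    while (my $line = <$in>) {
        # UTF-16LE does not consume a byte order mark, so drop it by hand.
        $line =~ s/^\x{FEFF}// if $. == 1;
        print {$out} $line;
    }

    close($in);
    close($out) or die "Cannot close $dst: $!";
    return $dst;
}

# Demo: a temporary file plays the role of the file Excel would write.
my $dir  = tempdir(CLEANUP => 1);
my $src  = File::Spec->catfile($dir, 'sheet_utf16.txt');
my $dst  = File::Spec->catfile($dir, 'sheet_utf8.txt');
my $text = "\x{0BA4}\x{0BAE}\x{0BBF}\x{0BB4}\x{0BCD}\tTamil\r\n";

open(my $fh, '>:raw:encoding(UTF-16LE)', $src) or die "Cannot write $src: $!";
print {$fh} "\x{FEFF}", $text;
close($fh) or die "Cannot close $src: $!";

utf16le_to_utf8($src, $dst);
printf "%d bytes of UTF-8 written\n", -s $dst;
```

Using lexical filehandles and checking every open and close makes failures visible instead of silently producing an empty or truncated file.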