Re^3: Parsing a .xlsx file with chinese characters

I gave this a test with the exact code you gave on my machine, and it worked great!

$ perl test.pl
Sheet: Sheet1
( 0 , 0 ) => this
( 1 , 1 ) => is
( 2 , 2 ) => a
( 3 , 3 ) => test
( 4 , 1 ) => &#20160;&#40636;
Sheet: Sheet2
Sheet: Sheet3
$

(PM may convert the text (traditional Chinese "shenme" -- what) into an entity here, but it definitely worked in my xterm)

It may be that whatever you're using to view the file isn't expecting UTF-8; or, perhaps the encoding in the XLSX itself isn't UTF-8 (but I'm not sure if that's an option in XLSX files or what!).

Re^4: Parsing a .xlsx file with chinese characters by Sithiris (Novice) on Oct 03, 2011 at 21:32 UTC
Thanks for trying it. I'm guessing from a quick Google search of xterm that you are running the script in a non-Windows environment? Is it possible this would have an effect on its success? I'm guessing not, considering Excel is a Windows-based programme.
Re^5: Parsing a .xlsx file with chinese characters by anneli (Pilgrim) on Oct 05, 2011 at 04:44 UTC
You're right; I ran it on a Linux VM. If you're running this in the Windows terminal (cmd.exe or what have you), I'm inclined to think the problem isn't with the output from Excel::Spreadsheet, but that cmd doesn't display UTF-8 properly. What if you redirect the output of the script to a .html file, then try loading it in a browser? Make sure the encoding gets detected as UTF-8. If it displays correctly, it's just the terminal, and your data is fine. :)
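A rough sketch of the redirect-to-HTML check might look like the following. It is not the original test.pl: html_page is a hypothetical helper, and the sample string just stands in for whatever the script reads out of the spreadsheet. The page declares its charset explicitly, so the browser does not have to guess.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): wrap already-decoded
# text in a minimal HTML page that declares UTF-8.
sub html_page {
    my ($text) = @_;

    # Escape the characters that are special in HTML.
    (my $escaped = $text) =~ s/([&<>])/sprintf('&#%d;', ord($1))/ge;

    return join "\n",
        '<!DOCTYPE html>',
        '<html><head><meta charset="UTF-8"><title>xlsx test</title></head>',
        "<body><pre>$escaped</pre></body></html>",
        '';
}

# Encode characters to UTF-8 bytes on the way out.
binmode STDOUT, ':encoding(UTF-8)';

# Stand-in for a cell value: U+4EC0 U+9EBC, the two characters from the test sheet.
print html_page("( 4 , 1 ) => \x{4EC0}\x{9EBC}");
```

Run it as `perl sketch.pl > out.html` (the file names are placeholders) and open out.html in a browser. If the two characters show up there, the data is fine and the terminal is the culprit.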
Re^6: Parsing a .xlsx file with chinese characters by Sithiris (Novice) on Oct 05, 2011 at 21:17 UTC
My script prints to a UTF-8 encoded text file, which I opened in Word; the file opened fine, but it showed the wrong characters. What I am thinking is that it may be 'deconstructing' each character into its bytes: for example, instead of "\x{2013}" it is displaying "\xE2", "\x80", "\x93". If this is the case, would there be a way to force it to treat them as a single character?
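The byte sequence in that example can be reproduced with the core Encode module. This is only a sketch of the mechanism, not a diagnosis of the actual script: it assumes the three bytes are the UTF-8 encoding of U+2013, and that whatever displays them is reading them one byte at a time (cp1252 is used here as a guess at the Windows default).

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# U+2013 (EN DASH) as a single Perl character.
my $char = "\x{2013}";

# Encoded as UTF-8 it becomes three bytes: E2 80 93.
my $bytes = encode('UTF-8', $char);
my $hex   = join ' ', map { sprintf('%02X', ord($_)) } split //, $bytes;

# Decoding those bytes as UTF-8 gives the single character back.
my $round_trip = decode('UTF-8', $bytes);

# Decoding the same bytes as cp1252 (an assumption about the viewer)
# gives three unrelated characters instead of one dash.
my $misread = decode('cp1252', $bytes);

binmode STDOUT, ':encoding(UTF-8)';
print "UTF-8 bytes: $hex\n";
printf "characters after UTF-8 decode: %d\n",  length $round_trip;
printf "characters after cp1252 decode: %d\n", length $misread;
```

If the strings in the script turn out to hold undecoded bytes like $bytes above, one way to "force" them back together is decode('UTF-8', ...) on input, with an :encoding(UTF-8) layer on the output filehandle so each character is encoded exactly once.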
Re^7: Parsing a .xlsx file with chinese characters by anneli (Pilgrim) on Oct 05, 2011 at 21:55 UTC
Re^8: Parsing a .xlsx file with chinese characters by Sithiris (Novice) on Oct 06, 2011 at 20:04 UTC