When you say "the UTF-8 CSV is fine" you mean, I assume, that you can open the CSV file in a text editor and the characters display correctly.
The CSV file from xls2csv looks fine whether I just "cat" it in a terminal window, open it in a text editor, or open it in Excel. The CSV from my script comes out with the wrong encoding in all three situations. The reference to opening the output in Excel, in reply to poj's sample script, was only to see if another program (besides 'cat' or a text editor) could make sense of the data.
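Since every viewer in that list can silently transcode or guess an encoding, the only authoritative comparison is the raw bytes of the two files. A minimal sketch (the sample file and its contents are made up for illustration; substitute the real CSVs):

```shell
# Dump the actual bytes rather than trusting a viewer's rendering.
# A correctly UTF-8-encoded 'é' is the two-byte sequence c3 a9.
printf 'é\n' > sample.csv
od -An -tx1 sample.csv
```

Running `od -An -tx1` on both the xls2csv output and the script's output, side by side at the same cell, shows immediately whether the script is emitting different byte sequences or the viewers are just decoding the same bytes differently.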
The source data is an Excel workbook or, more precisely, a Microsoft Excel Open XML Format Spreadsheet (XLSX) file created by a third-party website. I doubt an actual instance of Excel is being used since the file is dynamically created by a database query. After I process it, my output CSV is going to be uploaded to a different third-party website. There is no Microsoft Excel processing the output. The goal is to eliminate humans opening the spreadsheet in Excel and manipulating it at all.
Aside: I unzipped the .xlsx file and looked at the raw XML. Everything in the source file is declared as UTF-8. So I'm starting with UTF-8 and trying to end with UTF-8. (Or utf8 -- I tried both for my output, since in Perl the two names are not equivalent: "UTF-8" is the strict standard encoding, while "utf8" is Perl's looser internal variant.)
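One common failure mode worth ruling out: the symptoms described (bytes fine from xls2csv, garbled from the script, even though everything claims to be UTF-8) match double encoding, where already-UTF-8 bytes are decoded as Latin-1 and then re-encoded. A sketch of the signature, using `iconv` to simulate the wrong decode step:

```shell
# 'café' in UTF-8 is 63 61 66 c3 a9.
# Decoding those bytes as Latin-1 and re-encoding as UTF-8 mangles
# every multibyte character a second time: c3 a9 becomes c3 83 c2 a9,
# which displays as 'Ã©'.
printf 'café' | iconv -f LATIN1 -t UTF-8 | od -An -tx1
```

If the script's output shows the `c3 83`/`c2 xx` pattern where the xls2csv output shows plain `c3 a9`-style sequences, the script is decoding its (already UTF-8) input without telling Perl the encoding, then encoding again on output.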
In reply to Re^4: XLSX to CSV with high ASCII characters
by apu
in thread XLSX to CSV with high ASCII characters
by apu