mcoblentz has asked for the wisdom of the Perl Monks concerning the following question:

Fellow monks,

My SaaS vendor is supplying me with an download file apparently in the format of Excel 2004 XML. I need to convert this to either a vanilla .xls format or a xlsx format (don't care which) so I can upload it into another database for analysis. I am having a devil of a time parsing this file format (and I might be wrong about what it truly is) using any of the typical Spreadsheet::<something> modules.

I have tried Spreadsheet:ParseExcel, Spreadsheet::XLSX (which is discontinued I think), Spreadsheet::Read, and Spreadsheet::ParseXLSX. All fail to open the file. I mostly get:

format error: can't find EOCD signature at /opt/local/lib/perl5/site_perl/5.22/Archive/Zip/Archive.pm line 71 +8. Archive::Zip::Archive::_findEndOfCentralDirectory(Archive::Zip::Ar +chive=HASH(0x7f883a82c478), IO::File=GLOB(0x7f883e43f2d8)) called at +/opt/local/lib/perl5/site_perl/5.22/Archive/Zip/Archive.pm line 591 Archive::Zip::Archive::readFromFileHandle(Archive::Zip::Archive=HA +SH(0x7f883a82c478), IO::File=GLOB(0x7f883e43f2d8), "/Users/coblem/Dow +nloads/KeywordExposure-EMC (Owner)-February "...) called at /opt/loca +l/lib/perl5/site_perl/5.22/Archive/Zip/Archive.pm line 559 Archive::Zip::Archive::read(Archive::Zip::Archive=HASH(0x7f883a82c +478), "/Users/coblem/Downloads/KeywordExposure-EMC (Owner)-February " +...) called at /opt/local/lib/perl5/site_perl/5.22/Spreadsheet/ParseX +LSX.pm line 56 Spreadsheet::ParseXLSX::parse(Spreadsheet::ParseXLSX=HASH(0x7f883a +805680), "/Users/coblem/Downloads/KeywordExposure-EMC (Owner)-Februar +y "...) called at excel_parsing.pl line 10 Can't open file '/Users/coblem/Downloads/KeywordExposure-EMC (Owner)-F +ebruary 2016-05312016_182543.xls' as a zip file at /opt/local/lib/per +l5/site_perl/5.22/Spreadsheet/ParseXLSX.pm line 56.

I can open the file with Excel, and do a save As. I'm not sure if perl can tell Excel to do a SaveAs from outside the program control (I suppose VB could do it but I don't want to learn VB to do this - I would not be able to maintain the script).

So. How would you attack the problem? My current approach is to open the file, slurp it in, then spit it out. But if I could get Excel to do it, I would be just as happy. I don't see how to control excel to do that, however. And the final operating environment will be windows server, although I do all my unit testing on a mac.

A search for program control of excel did not turn up anything I could use.

Your advice?

Thanks in Advance,

Replies are listed 'Best First'.
Re: Okay, how would you attack this one?
by poj (Abbot) on Jun 02, 2016 at 19:52 UTC
    #!perl use strict; use Win32::OLE; $Win32::OLE::Warn = 3; # die on errors my $file = 'c:\\temp\\Book1.xml'; my $Excel = Win32::OLE->GetActiveObject('Excel.Application') || Win32::OLE->new('Excel.Application'); my $workbook = $Excel->Workbooks->Open($file); $workbook->saveAs('c:\\temp\\Book1.xlsx',{FileFormat => 51}); #xlWorkb +ookDefault $workbook->Close(); $Excel->Quit();
    poj
      I learn something new every day. I had no idea. Thank you.
Re: Okay, how would you attack this one?
by afoken (Chancellor) on Jun 02, 2016 at 19:46 UTC

    *.xlsx is a ZIP archive containing XML and some other files. From your error message, the "Excel File" does not look like a ZIP file. That matches the file extension *.xls, so it most likely is an old-style Excel (95/97) file, a binary blob. If the old-style tools (Spreadsheet::ParseExcel) don't accept it, try opening a copy in a plaintext editor (Notepad++, Ultraedit, joe, vi, emacs, nano). There are a lot of people on this planet that call every file readable by any version of Excel an "Excel file", including CSV files. At least some Excel versions also accept CSV files with an *.xls extension, IIRC. So people rename *.csv files to *.xls and "it just works". If you see data that makes some sense in the editor, try using Text::CSV_XS.

    Alexander

    --
    Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)