Fellow monks,

My SaaS vendor is supplying me with a download file, apparently in the Excel 2004 XML format. I need to convert it to either a vanilla .xls file or an .xlsx file (I don't care which) so I can upload it into another database for analysis. I am having a devil of a time parsing this file format (and I might be wrong about what it truly is) using any of the typical Spreadsheet::<something> modules.

I have tried Spreadsheet::ParseExcel, Spreadsheet::XLSX (which is discontinued, I think), Spreadsheet::Read, and Spreadsheet::ParseXLSX. All fail to open the file. I mostly get:

format error: can't find EOCD signature
 at /opt/local/lib/perl5/site_perl/5.22/Archive/Zip/Archive.pm line 718.
    Archive::Zip::Archive::_findEndOfCentralDirectory(Archive::Zip::Archive=HASH(0x7f883a82c478), IO::File=GLOB(0x7f883e43f2d8)) called at /opt/local/lib/perl5/site_perl/5.22/Archive/Zip/Archive.pm line 591
    Archive::Zip::Archive::readFromFileHandle(Archive::Zip::Archive=HASH(0x7f883a82c478), IO::File=GLOB(0x7f883e43f2d8), "/Users/coblem/Downloads/KeywordExposure-EMC (Owner)-February "...) called at /opt/local/lib/perl5/site_perl/5.22/Archive/Zip/Archive.pm line 559
    Archive::Zip::Archive::read(Archive::Zip::Archive=HASH(0x7f883a82c478), "/Users/coblem/Downloads/KeywordExposure-EMC (Owner)-February "...) called at /opt/local/lib/perl5/site_perl/5.22/Spreadsheet/ParseXLSX.pm line 56
    Spreadsheet::ParseXLSX::parse(Spreadsheet::ParseXLSX=HASH(0x7f883a805680), "/Users/coblem/Downloads/KeywordExposure-EMC (Owner)-February "...) called at excel_parsing.pl line 10
Can't open file '/Users/coblem/Downloads/KeywordExposure-EMC (Owner)-February 2016-05312016_182543.xls' as a zip file at /opt/local/lib/perl5/site_perl/5.22/Spreadsheet/ParseXLSX.pm line 56.
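The EOCD complaint comes from Archive::Zip, and an .xlsx file is a zip container, so the trace is at least consistent with the file not being .xlsx at all. A quick way to check what the file really is would be to look at its leading bytes. The sketch below uses only core Perl; sniff_format() is a hypothetical helper name, and the assumption that an Excel XML file announces the SpreadsheetML namespace within its first 512 bytes is mine, not something I have confirmed against this vendor's output.

```perl
use strict;
use warnings;

# Hypothetical helper: classify a spreadsheet-like file by its leading bytes.
sub sniff_format {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my $n = read($fh, my $head, 512);
    die "Cannot read $path: $!" unless defined $n;
    close $fh;

    # Zip local-file header: what .xlsx parsers expect to find.
    return 'zip'  if $head =~ /\APK\x03\x04/;
    # OLE2 compound document: the container used by binary .xls files.
    return 'ole2' if $head =~ /\A\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1/;
    # Assumption: the SpreadsheetML namespace appears near the top of the file.
    return 'spreadsheetml'
        if $head =~ /urn:schemas-microsoft-com:office:spreadsheet/;
    # Some other XML dialect (optional UTF-8 byte order mark allowed).
    return 'xml'  if $head =~ /\A(?:\xEF\xBB\xBF)?\s*<\?xml/;
    return 'unknown';
}

# Usage: perl sniff.pl path/to/file.xls
print sniff_format($ARGV[0]), "\n" if @ARGV;
```

If this reports spreadsheetml or xml, none of the zip-based or binary .xls parsers would be expected to open the file, whatever its extension says.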

I can open the file with Excel and do a Save As. I'm not sure whether Perl can tell Excel to do a Save As from outside the program (I suppose VB could do it, but I don't want to learn VB for this; I would not be able to maintain the script).

So, how would you attack the problem? My current approach is to open the file, slurp it in, then spit it out in the new format. If I could get Excel to do the conversion, I would be just as happy, but I don't see how to control Excel from a script. The final operating environment will be Windows Server, although I do all my unit testing on a Mac.

A search for program control of Excel did not turn up anything I could use.
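For what it's worth, the Win32::OLE module is the usual way to drive Excel from Perl on Windows, so a Save As from a script might look roughly like the sketch below. The helper names file_format_for() and convert_with_excel() are hypothetical, and I have not run this against the vendor file. It assumes Excel is installed on the machine that runs it; it cannot work on the Mac, and Microsoft discourages unattended Office automation on servers, so treat it as an idea rather than a recommendation.

```perl
use strict;
use warnings;
use File::Spec;

# Excel's XlFileFormat values for the two target formats.
my %XL_FILE_FORMAT = (
    xlsx => 51,    # xlOpenXMLWorkbook
    xls  => 56,    # xlExcel8 (Excel 97-2003 binary workbook)
);

# Hypothetical helper: pick the SaveAs format code from the output extension.
sub file_format_for {
    my ($path) = @_;
    my ($ext) = $path =~ /\.([^.\\\/]+)\z/;
    my $code = defined $ext ? $XL_FILE_FORMAT{ lc $ext } : undef;
    die "Unsupported output extension in '$path'\n" unless defined $code;
    return $code;
}

# Hypothetical helper: open $in in Excel and save a copy as $out.
sub convert_with_excel {
    my ($in, $out) = @_;
    my $format = file_format_for($out);

    # Loaded at run time so the script still compiles where Win32::OLE is absent.
    require Win32::OLE;

    my $excel = Win32::OLE->new('Excel.Application', 'Quit')
        or die "Cannot start Excel: " . Win32::OLE->LastError() . "\n";
    $excel->{DisplayAlerts} = 0;    # suppress overwrite and compatibility prompts

    # Excel resolves relative paths against its own working directory,
    # so both paths are made absolute first.
    my $book = $excel->Workbooks->Open(File::Spec->rel2abs($in))
        or die "Cannot open '$in': " . Win32::OLE->LastError() . "\n";
    $book->SaveAs(File::Spec->rel2abs($out), $format);
    my $err = Win32::OLE->LastError();
    $book->Close(0);                # close without saving further changes
    die "SaveAs failed: $err\n" if $err;
    return $out;
}

# Usage: perl convert.pl input.xls output.xlsx
convert_with_excel(@ARGV[0, 1]) if @ARGV == 2;
```

Because Win32::OLE is only loaded inside convert_with_excel(), the extension-to-format logic can still be unit tested on a machine without Excel.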

Your advice?

Thanks in Advance,


In reply to Okay, how would you attack this one? by mcoblentz
