Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

HI I am having problem reading a MS-excel files with embedded Macro's in it by a Perl script. This MS-Excel file is located on a Unix Server. Please help Thank you
  • Comment on Problem reading a MS-excel files with embedded Macro's

Replies are listed 'Best First'.
Re: Problem reading a MS-excel files with embedded Macro's
by jmcnamara (Monsignor) on Jan 25, 2002 at 13:33 UTC

    I'm guessing that you are using Spreadsheet::ParseExcel which in turn uses OLE::Storage_Lite.

    OLE::Storage_Lite has known issues with Excel files that contain macros. From the documentation:

    CAUTION A saved file with VBA (a.k.a Macros) by this module will not work co +rrectly.

    Perhaps, you can access the file via ODBC or ADO either locally or remotely. I've never done this on Unix however. Maybe someone else can suggest how.

    --
    John.

Re: Problem reading a MS-excel files with embedded Macro's
by JayBonci (Curate) on Jan 25, 2002 at 14:47 UTC
    Okay, you have a few other options than reading the spreadsheet directly. You can try to just strip it of the data, which shouldn't be so terrible, or you can surgically remove the macro from the file.

    The best way I've found to parse through Microsoft Office files is to deal with them in their HTML exports. Microsoft has gone through a hell of a lot of trouble getting these files to preserve ALL data in their HTML format, and that includes any and all VB Macros. Therefore, you if you can export the file to html, you may be good to go.

    Systematically, if you save an macro-bearing Excel file as .html, you are going to have a few files, the first being something titlename.html and then a folder anmes titlename_files in the titlename_files folder is where the gold is. There are any number of MIME encoded embedded objects in that folder. You can tell what is available by checking on in the filelist.xml file provided by the Microsoft exporter. Look for the following line.

    <o:File HRef="editdata.mso"/>
    (or any *.mso files). Those are ActiveMime encoded (try MIME perl modules to try and read them if you can, more likely than not, they are objectified macro files, but you may be able to unravel them from there. If you remove that line, and move the file out of that working folder, then it won't see it any more, and you'll be rid of the macro. If you reimport, and resave it, those objects will be gone.

    You may be able to do that programatically on a Windows machine before dumping it over to a UNIX system. To automate the process, try checking into these ProgIDs for COM objects for Excel:
    • Excel.Addin
    • Excel.Application
    • Excel.Application.9 (Office 2000, the version I have, is Office version 9)
    • Excel.Backup
    • Excel.Chart
    • Excel.Chart.5
    • Excel.Chart.8
    • Excel.CSV
    • Excel.Dialog
    • Excel.DIF
    • Excel.Macrosheet
    • Excel.Sheet
    • Excel.Sheet.5
    • Excel.Sheet.8
    • Excel.Template
    • Excel.VBAModule
    • Excelhtmlfile
    • Excelworksheet
    • ExcelChart
    This is a small cross section of my registry on a clean install of Office 2k. There are a bunch of other ProgIDs, but they don't seem relevant. This is getting slightly offtopic, but you can access Excel through Perl Via COM. Check ActiveState's documentation and Microsoft's under Office development. Fish around, and hopefully you can find the right OM calls to make your day a little easier.

    --jay