in reply to Excel Spreadsheet Data Set

This is a great solution if your site is hosted on a Lose321 system, but not as useful on UNIX/Linuxen.

More often, I find myself receiving data in spreadsheet form and having to somehow manipulate it in perl before inserting it into a database, or vice versa.

I have found it to be fairly cumbersome to save the individual worksheets from an Excel workbook file (*.xls) as CSV files. Once this is done, of course, using either DBI::CSV or just straight Perl to manipulate the data is simple and straightforward (and probably not worthy of further discussion).

The reverse direction is even more cumbersome: putting several CSV files into a workbook as separate worksheets.

Anyone know of a portable library (preferably with a Perl module interface) for directly manipulating Excel workbook files? It would have to work on UNIX (Solaris, and maybe Linux).

Thanks, dmm

Just call me the Anti-Gates ...

1 Lose32 - I first saw this term used by Al Stevens in his C Programming column in Dr. Dobbs Journal sometime in 2000, I forget exactly when.

Replies are listed 'Best First'.
Re: Re: Excel Spreadsheet Data Set
by jmcnamara (Monsignor) on Aug 23, 2001 at 02:07 UTC
    The reverse direction is even more cumbersome: putting several CSV files into a workbook as separate worksheets.

    The Spreadsheet::WriteExcel module can create a multiple worksheet Excel file on Unix. The standard distribution comes with an example program called csv2xls.pl which will convert CSV to an Excel file. There is also an example here.

    Anyone know of a portable library (preferably with a Perl module interface) for directly manipulating Excel workbook files? It would have to work on UNIX (Solaris, and maybe Linux).

    Takinori Kawai's DBD::Excel module provides a DBI wrapper around Spreadsheet::WriteExcel and Spreadsheet::ParseExcel on all platforms. In addition Spreadsheet::ParseExcel, by the same author, contains a module called Spreadsheet::ParseExcel::SaveParser which uses WriteExcel and ParseExcel to provide a round-trip access to Excel files.

    None of these solutions are 100% effective but they may serve your needs.

    John.
    --

Re: Re: Excel Spreadsheet Data Set
by dragonchild (Archbishop) on Aug 22, 2001 at 21:44 UTC
    The issue with directly manipulating Excel files is that you have to have a good parser and a good writer. Spreadsheet::ParseExcel works quite well as a Parser. We use it at work, quite effectively. (It's impossible to read his code, and there's numerous improvements that can be made to his alpha release, but that's neither here nor there.)

    However, Spreadsheet::WriteExcel has been known to have issues, especially when working with Spreadsheet::ParseExcel, primarily with formulas.

    That aside, those two should make a very good pair.

    ------
    /me wants to be the brightest bulb in the chandelier!

    Vote paco for President!


      However, Spreadsheet::WriteExcel has been known to have issues, especially when working with Spreadsheet::ParseExcel, primarily with formulas.

      "Issues" might scare away the punters. ;-)

      The situation is as follows. Excel writes a formula in two parts. The RPN binary encoding of the formula and the result of the calculation. Spreadsheet::WriteExcel only writes the encoded formula part because it wouldn't be possible to the calculate, a priori, the result of arbitrarily complex expressions. Equally Spreadsheet::ParseExcel only reads the result part of a formula because it isn't in a position to evaluate the formula. Therefore, when Spreadsheet::ParseExcel reads a formula written by Spreadsheet::WriteExcel it only sees a blank result.

      In the circumstances this is a best effort by both modules and most users would be unaffected by it.

      John.
      --

Re: Re: Excel Spreadsheet Data Set
by rob_au (Abbot) on Aug 22, 2001 at 21:48 UTC
    This is a great solution if your site is hosted on a Lose321 system, but not as useful on UNIX/Linuxen.
     
    Entirely untrue - This solution was specifically built so that the client could upload their Excel spreadsheet from their Win32 machine to the Unix web server where the data could be integrated into the web page.
     
    Anyone know of a portable library (preferably with a Perl module interface) for directly manipulating Excel workbook files? It would have to work on UNIX (Solaris, and maybe Linux).
     
    You might do good to take a further look into the Spreadsheet::ParseExcel module - This module provides excellent methods for extracting spreadsheet data and format information, irrelevant of the platform.
     

     
    Ooohhh, Rob no beer function well without!