biswanath_c has asked for the wisdom of the Perl Monks concerning the following question:


Hi

I am trying to read a 120MB excel workbook which has 25 spreadsheets each containing 60K rows of data. The Parse method gives an "out of memory" error. Anybody knows why this might be? Is there some kind of limit on the size or the no. of spreadsheets within a workbook?


  • Comment on Reading huge spreadsheets using Spreadsheet::ParseExcel...

Replies are listed 'Best First'.
Re: Reading huge spreadsheets using Spreadsheet::ParseExcel...
by dasgar (Priest) on Nov 11, 2010 at 21:14 UTC

    You might want to check out the Reducing the memory usage section of the documentation for Spreadsheet::ParseExcel to see if that information might help you to modify your code in order to dodge the memory usage issue.

    (Of course, the author is an active user on Perl Monks and he might respond with more information and/or tips at some point.)

    The only alternative that I can think of would be to use Win32::OLE on a Windows system that has Excel installed.

      the link to "reducing the memory usage" was actually helpful; i followed the tips and it got rid of the "out of mem" error all right; but i stumbled on a diff. problem now; what do i do in the cell handler function so that i can read the cell contents using the get_cell() and value() functions?

      currently it looks like the cell handler blocks cell read alltogether. I am trying to read the cell contents and dump them into a database - the code was working fine for small spreadsheets; but now after the cell handler function, the table in the db comes out empty which obviously means the cell values are unable to be read.
      any tips?

        What do i do in the cell handler function so that i can read the cell contents using the get_cell() and value() functions?

        You don't need to use get_cell() since the cell_handler() callback function that you supply effectively replaces it.

        Once you have the cell object you can call value() (or any other method) on it in the same way you would with the standard interface. This is shown in the example that dasgar linked to:

        sub cell_handler { my $workbook = $_[0]; my $sheet_index = $_[1]; my $row = $_[2]; my $col = $_[3]; my $cell = $_[4]; # Do something useful with the formatted cell value print $cell->value(), "\n"; }

        --
        John.

Re: Reading huge spreadsheets using Spreadsheet::ParseExcel...
by sundeep (Acolyte) on Nov 11, 2010 at 21:32 UTC
    I too had some problems initially like this. Remember, .XLS doesn't support more than 65,536 rows, this is because of the limitation 2^16 . Only .XLSX can have more than 2^16 rows. But the problem is Spreadsheet::ParseExcel cannot read files in the Excel 2007 Open XML XLSX format. So, i think , you have to go for Spreadsheet::XLSX.