in reply to Re^5: Append new line in excel sheets
in thread Append new line in excel sheets

Hi CountZero, Sorry I didn't respond your query. Here is my actual problem: I'm converting Excel(.xls, .xlsx) files to .csv files. Since a single file contains around 10k lines so on calling parse in the below code:

my $parser = Spreadsheet::ParseExcel->new(); my $Book = $parser->parse($in_file);

memory out of bound error comes on command prompt, but I figured out to resolve this using cell_handler. But for few excel files cell handler skips the last row of the excel files which contain tables. After doing some R&D I found out that after adding a row in the end of file resolves the problem. In order to make it work I have to append a new row in the end. I don't want to use OLE as on our production system MS Office will not be installed. So kindly suggest me something. In response of your question: Yes the program fails, It throws memory out of bound on command prompt. And when I monitor the memory usage it hits 1GB mark for this size. The version of Excel files I'm working on is Microsoft Office Standard 2010. They might be of older version also as we get these excel files from some other resources which can not be controlled by us. In case I missed to mention something kindly point out. I will be very thanks... I'm so stuck here.

Replies are listed 'Best First'.
Re^7: Append new line in excel sheets
by roboticus (Chancellor) on Jul 23, 2014 at 11:49 UTC

    perl_new_b:

    I should've read the entire thread before posting my last response. If you're just trying to convert the spreadsheets to .CSV files and add some data to the end, then it's really pretty simple. Writing a new cell handler (as described in the section "Reducing the memory usage of Spreadsheet::ParseExcel" of the Spreadsheet::ParseExcel documentation is pretty easy--all you need to do is store the value, not the formatting, etc. That will make the program consume *much* less RAM. (The cell handler is what builds the *huge* data structure containing all the formatting information, the cell value, unformatted value, etc. Also, each cell is an object containing other overhead such as references to the parent structures, miscellaneous information, etc.

    It should be as easy as:

    #!/usr/bin/perl -w use strict; use Spreadsheet::ParseExcel; # 3D array for the workbook data: # [0]=sheet #, [1]=row, [2]=col my @WBData; my $parser = Spreadsheet::ParseExcel->new( CellHandler => \&cell_handler, NotSetCell => 1 ); my $workbook = $parser->parse('file.xls'); sub cell_handler { my $workbook = $_[0]; my $sheet_index = $_[1]; my $row = $_[2]; my $col = $_[3]; my $cell = $_[4]; # Store *just* the value into our array $WBData[$sheet_index][$row][$col] = $cell->value(); }

    Note: this is untested, it's just lifted straight from the example in the previous link with a couple minor edits.

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

      Hi robotics, Thanks a lot. but kindly can u guide, actually i got confused at where you are appending a new row at the end in the code mentioned by you.

        perl_new_b:

        The bit of code I showed doesn't append a new row. Instead, it creates an array with the data in the original Excel spreadsheet. All you need to do is to append the data to the appropriate places in the array, and then write the data. Excel can read a .CSV file, so I'd just use Text::CSV to write the data file. (You can even name it with a .xls extension and Excel will read it nicely. Remember, though, there will be no formatting or charts.

        ...roboticus

        When your only tool is a hammer, all problems look like your thumb.

        Hi @roboticus, I want to append a new line at the end of a excel file(.xls to be precise).

        Kindly guide me for the above. Thanks

Re^7: Append new line in excel sheets
by CountZero (Bishop) on Jul 15, 2014 at 16:52 UTC
    This is a "catch 22" situation: using a plain Spreadsheet::ParseExcel (as shown in my example) is not possible because of memory constraints and using Spreadsheet::ParseExcel with a cell handler fails on some spreadsheets.

    To make these spreadsheets work, you have to add an extra line at the end, but you can only do that by using Spreadsheet::ParseExcel which won't work (see above).

    I'm afraid you are stuck. :(

    The only solution I see is to contact the author of Spreadsheet::ParseExcel and inform him of the "bug" in the cellhandler routine which fails to find all cells and see if he has a solution.

    That, or use a machine which is less loaded or has more memory.

    CountZero

    A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

    My blog: Imperial Deltronics

      Hi CountZero thanks a lot for this very useful suggestion :) I surly will report this as a bug on their forum.

        If I installed Excel Viewer, then would it be possible to use WIn32::OLE?? Currently, I'm using :

        my $Excel = Win32::OLE->new('Excel.Application') or die "oops\n";

        But for the above case of Excel.Application it dies, can I use some other call for that instead of Excel.Application???