biswanath_c has asked for the wisdom of the Perl Monks concerning the following question:


I know how to do basic read operations on excel spreadsheets using PERL. I have one question on it though. How do i programatically find out what is the last row having data in a particular column in the spreadsheet? (i would like to find out the last row containing data in a particualr column while a few rows in the middle could be empty; i.e. in column C, row 500 could be the last row containing some valid data but rows 100, 200, 300 and 400 could be blank with no data. In this case, how do i programatically find out using PERL that row 500 is the last row in column C having some data?)


Replies are listed 'Best First'.
Re: Reading Excel spreadsheets...
by VinsWorldcom (Prior) on Jan 19, 2010 at 20:57 UTC

    Perhaps you could share how you "do basic read operations on excel spreadsheets using Perl"? I use Win32::OLE as I run on a Windows box and it provides more functionality in the creation of Excel objects than the Spreadsheet::* modules.

    If that's the case, you can use:

    my $MinRow = $inWks->UsedRange->Find({What=>"*", SearchDirection=>xlNe +xt, SearchOrder=>xlByRows})->{Row}; my $MaxRow = $inWks->UsedRange->Find({What=>"*", SearchDirection=>xlPr +evious, SearchOrder=>xlByRows})->{Row}; my $MinCol = $inWks->UsedRange->Find({What=>"*", SearchDirection=>xlNe +xt, SearchOrder=>xlByColumns})->{Column}; my $MaxCol = $inWks->UsedRange->Find({What=>"*", SearchDirection=>xlPr +evious, SearchOrder=>xlByColumns})->{Column};

    after you define an Excel OLE object, workbook and worksheet.


      THanks for the response! I use use Spreadsheet::ParseExcel module to do the read operations on spreadsheets.


Re: Reading Excel spreadsheets...
by Corion (Patriarch) on Jan 19, 2010 at 23:09 UTC

    Just look at the Spreadsheet::ParseExcel documentation:

    my ( $row_min, $row_max ) = $worksheet->row_range(); my ( $col_min, $col_max ) = $worksheet->col_range();
Re: Reading Excel spreadsheets...
by jmcnamara (Monsignor) on Jan 20, 2010 at 11:18 UTC

    I would like to find out the last row containing data in a particualar column.

    That information isn't stored in an Excel file. Excel only stores the maximum and minimum row and column for the entire worksheet.

    If you want to find individual maximum values for a column with Spreadsheet::ParseExcel you will have to iterate through all the rows. Here is a generalised example.

    #!/usr/bin/perl -w use strict; use Spreadsheet::ParseExcel; my $parser = Spreadsheet::ParseExcel->new(); my $filename = $ARGV[0] or die "Parsing error: must specify filena +me.\n"; my $workbook = $parser->parse( $filename ); die "Parsing error: ", $parser->error(), ".\n" if !defined $workbo +ok; # Select a particular worksheet. my $worksheet = $workbook->worksheet( 0 ); # Get the worksheet cell range. my ( $row_min, $row_max ) = $worksheet->row_range(); my ( $col_min, $col_max ) = $worksheet->col_range(); # Store the individual column maxima. my @col_maxes; for my $row ( $row_min .. $row_max ) { for my $col ( $col_min .. $col_max ) { my $cell = $worksheet->get_cell( $row, $col ); if ( defined $cell && $cell->value() ne '' ) { $col_maxes[$col] = $row; } } } # The rows and columns are zero indexed and columns that don't con +tain # data and thus don't have a maximum value will be undef. The foll +owing # prints out the data in Excel 1-indexed style. my $col_name = 'A'; for my $max ( @col_maxes ) { if ( defined $max ) { $max = $max + 1; } else { $max = 0; } print "Column '$col_name' max row = $max\n"; $col_name++; } __END__ For a spreadsheet like this: ________ _| Sheet1 |__________________________________________________ |_____________________________________________________________ +| | || | | +| | || A | B | C +| |_________||________________|________________|________________ +| | 1 || Foo | | +| |_________||________________|________________|________________ +| | 2 || | | Foo +| |_________||________________|________________|________________ +| | 3 || Foo | | +| |_________||________________|________________|________________ +| | 4 || | | Foo +| |_________||________________|________________|________________ +| Prints this: Columm 'A' max row = 3 Columm 'B' max row = 0 Columm 'C' max row = 4

    --
    John.