in reply to Re: Excel and Perl
in thread Excel and Perl

#!/usr/bin/perl -w use strict; use Spreadsheet::ParseExcel; use List::Util qw(max); my $parser = Spreadsheet::ParseExcel->new(); my $workbook = $parser->parse('4_19_10_Zrc01a-2.xls'); if ( !defined $workbook ) { die $parser->error(), ".\n"; } for my $worksheet ( $workbook->worksheets() ) { my ( $row_min, $row_max ) = $worksheet->row_range(); my ( $col_min, $col_max ) = $worksheet->col_range(); for my $row ( $row_min .. $row_max ) { for my $col ( $col_min .. $col_max ) { my $cell = $worksheet->get_cell( $row, $col ); next unless $cell; my $i=1; for ($col==5 and $cell->value()==$i){ if ($col==8){ print $cell->value(); print "\n"; }else { $i++; } } } } print "\n"; }
My sheets contain values in the manner of:
A B 1 3 1 2 1 4 2 5 2 7 3 10 3 0 3 3 . . 90 90
I need to find the maximum value in column B for every corresponding value of column A. For example: The maximum value for 1 in col A is 4 in col B. My experience in perl is less than 48 hours, any help will be appreciated. Thank you.

Replies are listed 'Best First'.
Re^3: Excel and Perl
by wind (Priest) on Jun 09, 2011 at 00:25 UTC

    Just put your data into a hash of arrays, and then loop on the compiled data using List::Util->max to determine the max value in each array.

    The below snippet demonstrates what I'm talking about with the fake data you listed:

    use List::Util qw(max); use strict; use warnings; my %vals; # Put Data into a Hash of Arrays while (<DATA>) { chomp; my ($key, $val) = split; push @{$vals{$key}}, $val; } for my $key (sort {$a <=> $b} keys %vals) { my $max = max @{$vals{$key}}; print "$key $max\n"; } =prints 1 4 2 7 3 10 =cut __DATA__ 1 3 1 2 1 4 2 5 2 7 3 10 3 0
      Is there a way to auto generate the "Hash of Arrays" from an excel sheet or would I be required to type them all in individually? The data is very high volume in each excel sheet and manually jotting them down to a program is definitely not an option. Thank you.

        Yes, you have all the resources you need to accomplish your goal. Just use Spreadsheet::ParseExcel to parse the spreadsheet, and use the code I demonstrated to process the data.

        You're going to have to do some learning to finish the project, but you have everything you need.

        Spreadsheet::Read reads whole spreadsheets in a single hash of lists (sheets) of lists (columns) of lists (rows), including meta-information my $ss = ReadData ("file.xls"); using Spreadsheet::ParseExcel (or any other supported spreadsheet parser).

        The first sheet is my $sheet = $ss->[1];.

        The formatted value of cell B3 can be accessed by $sheet->{"B3"}. The unformatted value by $sheet->{cell}[2][3].

        Does that help?


        Enjoy, Have FUN! H.Merijn
      use strict; use warnings; use Spreadsheet::Read; my $ss = ReadData ("4_19_10_Zrc01a-2.xls"); my $s = $ss->[1]; my %max; foreach my $r (1 .. $s->{maxrow}) { my ($A, $B) = ($s->{cell}[1][$r], $s->{cell}[2][$r]); $max{$A} //= $B; $max{$A} < $B and $max{$A} = $B; } print "$_\t$max{$_}\n" for sort { $a <=> $b } keys %max;
      $ perl test.pl 1 4 2 7 3 10 $

      Enjoy, Have FUN! H.Merijn
      use strict; use warnings; use Spreadsheet::ParseExcel; use Spreadsheet::Read; my $ss = ReadData ("4_19_10_Zrc01a-2.xls"); my $s = $ss->[1]; my %max; foreach my $r (1 .. $s->{maxrow}) { my ($A, $B) = ($s->{cell}[5][$r], $s->{cell}[8][$r]); $max{$A} //= $B; $max{$A} < $B and $max{$A} = $B; } print "$_\t$max{$_}\n" for sort { $a <=> $b } keys %max;
      the above code was given as a solution by wsfp, it doesn';t seemt to run for me and is giving me a long list of errors.
      Use of uninitialized value $A in hash element at D:\Programming\Perl\R +ead.plx li ne 14. Use of uninitialized value $A in hash element at D:\Programming\Perl\R +ead.plx li ne 15. Use of uninitialized value $B in numeric lt (<) at D:\Programming\Perl +\Read.plx line 15. Use of uninitialized value in numeric lt (<) at D:\Programming\Perl\Re +ad.plx lin e 15. Use of uninitialized value $A in hash element at D:\Programming\Perl\R +ead.plx li ne 14. Use of uninitialized value $A in hash element at D:\Programming\Perl\R +ead.plx li ne 15. Use of uninitialized value $B in numeric lt (<) at D:\Programming\Perl +\Read.plx line 15. Use of uninitialized value in numeric lt (<) at D:\Programming\Perl\Re +ad.plx lin e 15. Use of uninitialized value $A in hash element at D:\Programming\Perl\R +ead.plx li ne 14. Use of uninitialized value $A in hash element at D:\Programming\Perl\R +ead.plx li ne 15. Argument " Mapped_Aux_Number" isn't numeric in numeric lt (<) +at D:\Prog ramming\Perl\Read.plx line 15. Argument " Mapped_Aux_Number" isn't numeric in numeric lt (<) +at D:\Prog ramming\Perl\Read.plx line 15. Argument "Volt" isn't numeric in numeric lt (<) at D:\Programming\Perl +\Read.plx line 15. Argument "Volt" isn't numeric in numeric lt (<) at D:\Programming\Perl +\Read.plx line 15. Argument "Comments" isn't numeric in sort at D:\Programming\Perl\Read. +plx line 1 8. Argument "" isn't numeric in sort at D:\Programming\Perl\Read.plx line + 18. Argument "After Conference 3rd test for series.\r\nmass .0037" isn't n +umeric in sort at D:\Programming\Perl\Read.plx line 18. Comments Volt Mapped_Aux_Number After Conference 3rd test for series. mass .0037 0
      The above is the result after I run the program. thank you.
        foreach my $r (1 .. $s->{maxrow}) { my ($A, $B) = ($s->{cell}[5][$r] // 0, $s->{cell}[8][$r] // 0); $max{$A} //= $B; $max{$A} < $B and $max{$A} = $B; }

        will take away all those warnings. I bet you don't want columns "A" and "B" then, now do you? You still have to select the correct columns! Let's take columns "C" and "T" with the alternative approach. Then you start reading the documentation.

        foreach my $r (1 .. $s->{maxrow}) { my ($key, $value) = ($s->{"C$r"} // 0, $s->{"T$r"} // 0); $max{$key} //= $value; $max{$key} < $value and $max{$key} = $value; }

        Enjoy, Have FUN! H.Merijn