in reply to Re^9: Converting Excel to Hash
in thread Converting Excel to Hash

Excel Format: ID Name Degree CLZ 1 Teja ph.d nit ph.d iit 2 Ravi B.tech iir desired format: $var1 ='1'; $var2=[ { clz=>nit, degree=>ph.d, name=>teja } ] $var1='1' $var2=[ { clz=>iit, degree=>ph.d, name=>teja } ] $var1='2' $var2=[ { clz=>iir, degree=>b.tech, name=>ravi. } ]

Replies are listed 'Best First'.
Re^11: Converting Excel to Hash
by Corion (Patriarch) on Jan 04, 2017 at 12:03 UTC

    When I posted my code, I added some debug output for when lines from the input are skipped by your code. Maybe you want to look at that debugging output to find when your code skips input.

    Currently your code does not handle the case of the id being missing in the input. Maybe you want to change that.

      use Spreadsheet::ParseExcel; use Data::Dumper; $filename="Book2.xls"; $e=new Spreadsheet::ParseExcel; $eBook=$e->Parse($filename); $sheets = $eBook->{SheetCount}; ($eSheet, $sheetName); foreach $sheet (0 .. $sheets - 1) { $eSheet = $eBook->{Worksheet}[$sheet]; $sheetName = $eSheet->{Name}; print "Worksheet $sheet: $sheetName\n"; %set =(); %data =(); foreach $row( 1 .. $eSheet->{MaxRow} ) { if (defined ($eSheet->{Cells}[$row][0] )) { $master_key=($eSheet->{Cells}[$row][0]->Value); $r=$row; } else { $master_key=$master_key; } foreach $col(1 .. $eSheet->{MaxCol}) { my $key=$eSheet->{Cells}[0][$col]->Value; if (defined $eSheet->{Cells}[$row][$col]) { $val=($eSheet->{Cells}[$row][$col]->Value); } else { $val=$eSheet->{Cells}[$r][$col]->Value; } $set{$key}=$val; } push @{ $data{$master_key}}, \%set; } } print Dumper %data;

      I think this can handle even if the data is missing.But the output is not coming.The output is coming as shown below.

      Excel Format: ID Name Degree CLZ 1 Teja ph.d nit ph.d iit 2 Ravi B.tech iir output: $var1='1' $var2=[ { clz=>iir, degree=>b.tech, name=>ravi. }, $var2->[0]; ] $Var3='2'; $Var4=[ $var2->[0] ];

        You might find the task easier if you split it into 2 steps, first create an array with the missing data filled down and then create the hash. For example

        #!perl use strict; use Spreadsheet::ParseExcel; use Data::Dumper; my $filename = "Book2.xls"; my $e = new Spreadsheet::ParseExcel; my $eBook = $e->Parse($filename); my $eSheet = $eBook->{Worksheet}[0]; my ( $row_min, $row_max ) = $eSheet->row_range(); my ( $col_min, $col_max ) = $eSheet->col_range(); # build array filling in blanks my @row_array =(); my @data_array=(); for my $row ($row_min .. $row_max ){ for my $col ($col_min .. $col_max){ my $cell = $eSheet->get_cell( $row, $col ); if ($cell && $cell->value ne ''){ $row_array[$col] = $cell->value; } } $data_array[$row] = [@row_array]; } print Dumper \@data_array; # convert array to hash my %data = (); for my $row ($row_min+1 .. $row_max){ my %set = (); my $master_key = $data_array[$row][$col_min]; for my $col ($col_min+1 .. $col_max){ my $key = $data_array[$row_min][$col]; $set{$key}= $data_array[$row][$col] } push @{ $data{$master_key} }, \%set; } print Dumper \%data;
        update : push @data_array,[@row_array] changed
        to $data_array[$row] = [@row_array] to allow for $row_min not being 0.
        poj

        A cell can be defined but blank. Try this test code

        #!perl use strict; use Spreadsheet::ParseExcel; my $filename = "Book2.xls"; my $e = new Spreadsheet::ParseExcel; my $eBook = $e->Parse($filename); my $eSheet = $eBook->{Worksheet}[0]; for my $row (1 .. $eSheet->{MaxRow}){ for my $col (1 .. $eSheet->{MaxCol}){ if (defined $eSheet->{Cells}[$row][$col]){ my $val = $eSheet->{Cells}[$row][$col]->Value; print "$row $col = '$val'\n"; } else { print "$row $col Not defined\n"; } } }
        poj

        How/where do you think your code handles the case of missing data?

        I suggest that you add print statements to every line to see where your program progresses.

        Especially add print statements to the cases where valid data is found and where missing data is handled:

        ... if (defined ($eSheet->{Cells}[$row][0] )) { $master_key=($eSheet->{Cells}[$row][0]->Value); $r=$row; print "Found (new) master key '$master_key' in row $r\n"; } else { $master_key=$master_key; print "No (new) master key found in row $row, keeping current mast +er key '$master_key' from row $r\n"; } ... if (defined $eSheet->{Cells}[$row][$col]) { $val=($eSheet->{Cells}[$row][$col]->Value); print "Found valid value for '$key' in ($row,$col): '$val'\n"; } else { $val=$eSheet->{Cells}[$r][$col]->Value; print "Found invalid/empty value for '$key' in ($row,$col), reusing t +he value from ($r,$col): '$val'\n"; }

        Also, your code would greatly benefit from indenting every line by four spacess for every level of nesting. This makes it much easier for you and others to recognize the structure. Instead of

        foreach $row( 1 .. $eSheet->{MaxRow} ) { if (defined ($eSheet->{Cells}[$row][0] )) { $master_key=($eSheet->{Cells}[$row][0]->Value); $r=$row; } else { $master_key=$master_key; } ...

        write the code as

        foreach $row( 1 .. $eSheet->{MaxRow} ) { if (defined ($eSheet->{Cells}[$row][0] )) { $master_key=($eSheet->{Cells}[$row][0]->Value); $r=$row; } else { $master_key=$master_key; }

        Also see perlstyle and perltidy.