socrtwo has asked for the wisdom of the Perl Monks concerning the following question:

Hi again monks. Thanks for the help so far. I'm using H.Merijn Brand's ss2tk to display my Excel worksheets, however I'm using Rob Polocz' Spreadsheet::XLSX::Utility2007 module's xls2csv sub (really xlsx2csv) to convert my data into csv files.

My problem is the code of the module, seems to round off data to 2 decimal places whereas the ss2tk script doesn't have this problem. The xls2csv sub also uses the Spreadsheet::XLSX module so maybe something is going on there. Can anybody explain why or suggest a workaround like using the ss2tk or Spreadsheet::Read's processing to create a new xlsx2csv function?

The xls2csv section of the Spreadsheet::XLSX::Utility2007 module is below. Note Rob Polocz kindly suggested replacing

my $oExcel = new Spreadsheet::XLSX ; my $oBook = $oExcel->Parse( $filename) ;
with this line in the code...
my $oBook = Spreadsheet::XLSX->new($filename);
in order to make the xls2csv function work in the first place. He is currently working on revising his module.
sub xls2csv { my ($filename, $regions, $rotate) = @_ ; my $sheet = 0 ; my $output = "" ; # extract any sheet number from the region string $regions =~ m/^(\d+)-(.*)/ ; if( $2) { $sheet = $1 - 1 ; $regions = $2 ; } # now extract the start and end regions $regions =~ m/(.*):(.*)/ ; if( !$1 || !$2) { print STDERR "Bad Params"; return "" ; } my @start = sheetRef( $1) ; my @end = sheetRef( $2) ; if( !@start) { print STDERR "Bad coordinates - $1"; return "" ; } if( !@end) { print STDERR "Bad coordinates - $2"; return "" ; } if( $start[1] > $end[1]) { print STDERR "Bad COLUMN ordering\n"; print STDERR "Start column " . int2col($start[1]); print STDERR " after end column " . int2col($end[1]) . "\n"; return "" ; } if( $start[0] > $end[0]) { print STDERR "Bad ROW ordering\n"; print STDERR "Start row " . ($start[0] + 1); print STDERR " after end row " . ($end[0] + 1) . "\n"; exit ; } # start the excel object now my $oBook = Spreadsheet::XLSX -> new($filename); # my $oExcel = new Spreadsheet::XLSX ; # my $oBook = $oExcel->Parse( $filename) ; # open the sheet my $oWkS = $oBook->{Worksheet}[$sheet] ; # now check that the region exists in the file # if not trucate to the possible region # output a warning msg if( $start[1] < 0) { print STDERR int2col( $start[1]) . " < min col " . int2col(0) +. " Resetting\n"; $start[1] = 0 ; } if( $end[1] > 701) { print STDERR int2col( $end[1]) . " > max col " . int2col(701) +. " Resetting\n" ; $end[1] = 701 ; } if( $start[0] < 0) { print STDERR "" . ($start[0] + 1) . " < min row " . (0 + 1) . +" Resetting\n"; $start[0] = 0 ; } if( $end[0] > 1048575) { print STDERR "" . ($end[0] + 1) . " > max row " . (1048575 + 1 +) . " Resetting\n"; $end[0] = 1048575 ; } my $x1 = $start[1] ; my $y1 = $start[0] ; my $x2 = $end[1] ; my $y2 = $end[0] ; if( !$rotate) { for( my $y = $y1 ; $y <= $y2 ; $y++) { for( my $x = $x1 ; $x <= $x2 ; $x++) { my $cell = $oWkS->{Cells}[$y][$x] ; $output .= $cell->Value if(defined $cell); $output .= "," if( $x != $x2) ; } $output .= "\n" ; } } else { for( my $x = $x1 ; $x <= $x2 ; $x++) { for( my $y = $y1 ; $y <= $y2 ; $y++) { my $cell = $oWkS->{Cells}[$y][$x] ; $output .= $cell->Value if(defined $cell); $output .= "," if( $y != $y2) ; } $output .= "\n" ; } } return $output ; }

Replies are listed 'Best First'.
Re: Why Does the Module(s) Round off to 2 Decimal Places?
by frieduck (Hermit) on May 02, 2009 at 23:58 UTC
    You're calling $cell->Value which gets the formatted value whereas $cell ->{Val} should get you the unformatted (full) value.
    ss2tk uses the unformatted value via the cell hash entry method (ex: $s->{cell}$_$r) of getting data from Spreadsheet::Read
      Thanks very much for your answer and that may be the crux of the matter, however, when I change $cell->Value to $cell ->{Val} in Spreadsheet::XLSX::Utility2007, nothing changes. I also tried {$cell}->Value

        Below is my changed code, which tests fine for me. I also had to add a "Use Spreadsheet::XLSX" to the top of Spreadsheet::XLSX::Utility2007, and, for some reason I don't understand, had to change line 136 of Spreadsheet::XLSX::Fmt2007 to "return Spreadsheet::XLSX::Utility2007::ExcelFmt($sFmtStr, $Dt, $Flg1904, $oCell->{Type});"

        if( !$rotate) { for( my $y = $y1 ; $y <= $y2 ; $y++) { for( my $x = $x1 ; $x <= $x2 ; $x++) { my $cell = $oWkS->{Cells}[$y][$x] ; #$output .= $cell->Value if(defined $cell); $output .= $cell->{Val} if(defined $cell); $output .= "," if( $x != $x2) ; } $output .= "\n" ; } } else { for( my $x = $x1 ; $x <= $x2 ; $x++) { for( my $y = $y1 ; $y <= $y2 ; $y++) { my $cell = $oWkS->{Cells}[$y][$x] ; #$output .= $cell->Value if(defined $cell); $output .= $cell->{Val} if(defined $cell); $output .= "," if( $y != $y2) ; } $output .= "\n" ; } }
        It turns out that is you change in the Spreadsheet::XLSX module:
        my $cell =Spreadsheet::ParseExcel::Cell->new( Val => $v, Format => $thisstyle, Type => $type );
        to
        my $cell =Spreadsheet::ParseExcel::Cell->new( Val => $v, Format => $thisstyle, Type => "" );
        It removes the formatting. There are still minor differences with what appears in the ss2tk display. The format that results from the change mentioned above seems to use the same sort of rounding that Excel itself does before displaying the raw data from worksheet#.xml files.

        For instance a piece of raw data displayed in the ss2tk table and in the XML worksheet file is 1.3318000000000001 and changing the Spreadsheet::XLSX routine and using the Utility2007::xls2csv function and Excel itself displays a rounding to 1.3318.

        Thanks again for the help.