nayakr has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I have written script to convert xls/xlsx file to csv based on the documentation found on the CPAN documentation. Using Spreadsheet::XLSX, Text::Iconv and Spreadsheet::ParseExcel modules. When I execute the script by passing a xls file, it is getting converted to csv successfully but when I execute the same with xlxx file, the converted cdv file has TRUE/FALSE fields converted to number. Can someone in this forum please help or give me some direction to fix this issue. Appreciate your help. Thanks, Raj

  • Comment on string TRUE/FALSE getting converted to 1/0

Replies are listed 'Best First'.
Re: string TRUE/FALSE getting converted to 1/0
by davido (Cardinal) on Jun 14, 2015 at 00:48 UTC

    Show us in 20-30 lines of code the behavior that is causing you difficulty.

    It sounds to me as though your string-based boolean values (TRUE/FALSE) are being upgraded to Perl booleans, which isn't surprising, but may not be what you want to have happen. So let's see some code and perhaps we can help you to get the behavior you want.


    Dave

      Hi Dave, Here is my code:

      #!/usr/local/bin/perl # Declarations #=========================================================== use File::Basename; use Spreadsheet::ParseExcel; use Text::Iconv; use Spreadsheet::XLSX; use IO::Handle; use encoding 'UTF8'; # Program proper # Check the input file provided is a xls or xlsx #=========================================================== my $input_file_name = shift || usage(); ($file,$dir,$ext) = fileparse($input_file_name, qr/\.[^.]*/); $output_file_name = "$dir/"."$file".".csv"; open FILE, ">$output_file_name"; select FILE; #=========================================================== parse_excel($input_file_name); exit 0; # Subroutines #=========================================================== sub parse_excel { if ($ext eq ".xls") { my %arg = ( record_sep => "\n", # field_sep => "\t", field_sep => ",", xls => undef ); if ( @_ == 1 ) { $arg{xls} = shift; } elsif ( not @_ % 2 ) { %arg = ( %arg, @_ ); } -e $arg{xls} or die "Must provide valid XLS file! $arg{xls}, $!\n"; # create a ParseExcel object my $excel_obj = Spreadsheet::ParseExcel->new(); my $workbook = $excel_obj->Parse($arg{xls}); # make sure we're in business die "Workbook did not return worksheets!\n" unless ref $workbook->{Worksheet} eq 'ARRAY'; # we need to get each worksheet from the workbook in turn # for my $worksheet ( @{$workbook->{Worksheet}} ) my $worksheet = $workbook->Worksheet(0); { # {Cells}[row][col] # empty worksheets have undef for MaxCol and MaxRow my $last_col = $worksheet->{MaxCol} || 0; my $last_row = $worksheet->{MaxRow} || 0; for my $row ( 0 .. $last_row ) { for my $col ( 0 .. $last_col) { my $cell = $worksheet->{Cells}[$row][$col]; # print ref $cell ? # $cell->Value : '' ; $str = ref $cell ? $cell->Value : ''; $str =~ s/\s+$//g; $str =~ s/\"//g; $str =~ s/\r\n//g; $str =~ s/amp\;//g; $str =~ s/ \d\d\:\d\d\:\d\d//g; if ($str =~ /(\d*\d)-(\d*\d)-(\d\d)/) { ($m, $d, $y) = split (/-/, $str); #if ($y eq '00') {$y = 0}; if ($y > 50) { $y += 1900; } elsif ($y < 50) { $y += 2000; } $str = sprintf '%d/%d/%d', $m, $d, $y; } if ($row == 0) { print $str; } else { print "\"$str\""; } print $arg{field_sep} unless $col == $last_col; } print $arg{record_sep} ; # record ends } print "\n" ; # worksheet ends } } elsif ($ext eq ".xlsx") { my $converter = Text::Iconv -> new ("utf-8", "windows-1252"); my %arg = ( record_sep => "\n", # field_sep => "\t", field_sep => ",", xls => undef ); if ( @_ == 1 ) { $arg{xls} = shift; } elsif ( not @_ % 2 ) { %arg = ( %arg, @_ ); } -e $arg{xls} or die "Must provide valid XLS file! $arg{xls}, $!\n"; my $excel = Spreadsheet::XLSX -> new ($input_file_name, $converter); # my $excel = Spreadsheet::XLSX -> new ($input_file_name, ""); # foreach my $sheet (@{$excel -> {Worksheet}}) my $sheet = $excel -> Worksheet(0); { # printf("Sheet: %s\n", $sheet->{Name}); $sheet -> {MaxRow} ||= $sheet -> {MinRow}; foreach my $row ($sheet -> {MinRow} .. $sheet -> {MaxRow}) { $sheet -> {MaxCol} ||= $sheet -> {MinCol}; foreach my $col ($sheet -> {MinCol} .. $sheet -> {Max +Col}) { my $cell = $sheet->{Cells}[$row][$col]; #print ref $cell ? # $cell->Value : ''; $str = ref $cell ? $cell->Value : ''; $str =~ s/\s+$//g; $str =~ s/\"//g; $str =~ s/\r\n//g; $str =~ s/amp\;//g; $str =~ s/ \d\d\:\d\d\:\d\d//g; if ($str =~ /(\d*\d)-(\d*\d)-(\d\d)/) { ($m, $d, $y) = split (/-/, $str); #if ($y eq '00') {$y = 0}; if ($y > 50) { $y += 1900; } elsif ($y < 50) { $y += 2000; } $str = sprintf '%d/%d/%d', $m, $d, $y; } if ($row == 0) { print $str; } else { print "\"$str\""; } print $arg{field_sep} unless $col == $sheet -> {MaxCol}; } print $arg{record_sep}; # record ends } print "\n"; # worksheet ends } } elsif ($ext ne ".bubba") { print "This extension is not supported at this time"; print "\n"; } } CLOSE FILE; #=========================================================== sub usage { my ( $tool ) = $0 =~ m,([^\/]+$),; print <<HERE; ------------------------------------------------------------ USAGE: $tool EXCEL_FILE.xls || EXCEL_FILE.xlsx Takes an Excel file either with a xls or xlsx extension, parses it into comma delimited fields and rows, and sends the results to a filename with .csv extension. ------------------------------------------------------------ HERE exit 0; }
      Appreciate your time and help. Thanks, Raj

        One could ask "Is there some part of '20-30 lines' you don't understand?"

        Immediate apologies included: preceding is for emphasis (despite its regretable hint of rudeness). But perhaps that will also emphasize that boiling down your code to a snippet of 20-30 lines in length (which snippet still displays your problem) will not only make it easier for us to help you (and since you're asking for free help, that's a very good idea) but may also help you directly because confining the error to such a short snippet may make the cause obvious to you!.

Re: string TRUE/FALSE getting converted to 1/0
by Anonymous Monk on Jun 16, 2015 at 01:58 UTC

    Hi,

    If you just want to convert the file to csv and you are working on Windows, you might like to look at the SaveAs function in Win32::OLE.

    J.C.

Re: string TRUE/FALSE getting converted to 1/0
by locked_user sundialsvc4 (Abbot) on Jun 15, 2015 at 00:50 UTC

    One question that pops into my mind right now is ... “how does Excel decide that the value should be displayed as True or False,” and, “exactly what underlying value is being interpreted in this way?   Perhaps the actual value within the cell is a numeric 1/0, such that what the CSV is providing to you (namely, “the underlying cell-value, stripped of all formatting”) is, in fact, correct?

      Thanks Sundialsvc4 for your response. The Spreadsheet module (or one of the other module that I am using) seems to be causing this since it is working fine when I pass in a xls file. Is there some way I can tell the interpreter not to convert these strings when a XLSX file is passed? I tried using
      $str =~ s/TRUE/TRUE/i; $str =~ s/FALSE/FALSE/i;
      but that did not help. Thanks, Raj