in reply to XLS to CSV file with Text wrap

Note that a text-wrap in Excel does not mean that the data in the cell will contain new-lines when you read it with Spreadsheet::ParseExcel. It is a display-feature in Excel. When you want the CSV to do something similar, you need to read the column-width and wrap the text yourself with something like Text::Wrap. Note however that the default font in Excel is most likely a variable width font and not a mono-spaced font like on a terminal, so that your wrapped text might wrap on completely different points in the field content.

corion's solution is already helpful (but it skips undefined fields), and could correctly deal with wrapped text, but if the sheet is huge, you might want to stream

my $csv = Text::CSV_XS->new ({ binary => 1, eol => "\r\n", auto_diag = +> 1 }); open my $fh, ">", "file.csv" or die "file.csv: $!"; foreach my $row_index ($source_sheet->row_range) { my @row; foreach my $col_index ($source_sheet->col_range) { my $source_cell = $source_sheet->{Cells}[$row_index][$col_inde +x]; my $value; if ($source_cell) { local $Text::Wrap::columns = 20; $value = wrap ("", "", $source_cell->Value); } push @row, $value; } $csv->print ($fh, \@row); } close $fh;

Enjoy, Have FUN! H.Merijn

Replies are listed 'Best First'.
Re^2: XLS to CSV file with Text wrap
by kshitij (Sexton) on Sep 02, 2018 at 04:13 UTC
    Hi Tux ,

    Thanks for providing the solution. I have tried it out but the generated CSV file is not getting generated properly. It is splitting most of the content in the newlines in the CSV file and coming in a single column .

    Please find my script .
    #!/usr/bin/perl -w #use Text::CSV_XS; use strict; use warnings; use lib '/home/users/kulshrk/perl_module/lib64/perl5'; use Spreadsheet::ParseExcel; use Text::Wrap; use Text::CSV_XS; #use Text::CSV_XS qw( csv ); my $csv = Text::CSV_XS->new ({ binary => 1, eol => "\r\n", auto_diag = +> 1 }); open my $fh, ">", "Project_status_tracking_DFT.csv" or die "Project_st +atus_tracking_DFT.csv: $!"; my $sourcename = shift @ARGV or die "invocation: $0 <source file>\n"; my $source_excel = new Spreadsheet::ParseExcel; my $source_book = $source_excel->Parse($sourcename) or die "Could not +open source Excel file $sourcename: $!"; my $storage_book; #my $sheet = []; foreach my $source_sheet_number (0 .. $source_book->{SheetCount}-1) { # my $source_sheet = $source_book->{Worksheet}[$source_sheet_number]; print "--------- SHEET:", $source_sheet->{Name}, "\n"; next unless defined $source_sheet->{MaxRow}; next unless $source_sheet->{MinRow} <= $source_sheet->{MaxRow}; next unless defined $source_sheet->{MaxCol}; next unless $source_sheet->{MinCol} <= $source_sheet->{MaxCol}; # my $sheet = []; foreach my $row_index ($source_sheet->{MinRow} .. $source_sheet->{Max +Row}) { my @row; foreach my $col_index ($source_sheet->{MinCol} .. $source_sheet->{Ma +xCol}) { my $source_cell = $source_sheet->{Cells}[$row_index][$col_index]; my $value; if ($source_cell) { local $Text::Wrap::columns = 20; $value = wrap ("", "", $source_cell->Value); } push @row , $value; } $csv->print ($fh, \@row); } } close $fh;

    Could you suggest something here ? I played with local $Text::Wrap::columns by changing the width but still issue is there .

    Thanks Kshitij
      Thanks for providing the solution. I have tried it out but the generated CSV file is not getting generated properly. It is splitting most of the content in the newlines in the CSV file and coming in a single column.

      And that is what it should look like when viewing as a plain text file. Now take a closer look at where all the quotes are located. CSV allows embedden newlines, which is what you asked for. CSV doe not support display properties, like OpenOffice, LibreOffice, Excel and other Spreadsheet GUI's do. It is plain text with control characters: EOL/EOR (end-of-line/end-of-record), SEP (separator), QUO (Quotation), and ESC (escape). Read the specs if you do not (yet) understand.

      A CSV parser (also builtin into spreadsheet GUI's) should be able to understand the structure and show you the data as you intended it to be.

      I start to suspect that you do not want CSV, but pre-formatted output in columns. That is not what CSV is for.


      Enjoy, Have FUN! H.Merijn
        Hi Tux ,

        I think you are right I am looking for a pre formatted xls into CSV and I think CSV is not used for that. I am changing my strategy. Now I am thinking to split the original XLS into multiple small XLS based on the pattern match and then format these XLS files based on other pattern match.

        To achieve this , I am trying below code. My intention is to split the big XLS file into small xls file based on the pattern match of the range of lines but while writing out the smaller xls based on above pattern match , I am getting empty xls.

        I am trying to match range of lines between /DFT Tasks for Ration/ and /END OF RATION/ and write it into separate xls.

        #!/usr/bin/perl -w use strict; use Spreadsheet::ParseExcel; use Spreadsheet::WriteExcel; my $parser = Spreadsheet::ParseExcel->new(); my $workbook = $parser->parse('Project_status_tracking_DFT.xls'); if ( !defined $workbook ) { die $parser->error(), ".\n"; } my $workbook1 = Spreadsheet::WriteExcel->new('Ration_Status.xls'); my @color = qw(white green orange pink blue magenta silver); my @format; for my $i (1..@color){ $format[$i] = $workbook1->add_format(bg_color => $color[$i]); $format[$i]->set_text_wrap(); } for my $worksheet ( $workbook->worksheets() ) { my $worksheet1 = $workbook1->add_worksheet(); my ( $row_min, $row_max ) = $worksheet->row_range(); my ( $col_min, $col_max ) = $worksheet->col_range(); # set minimum column width my @width; $width[$_] = 10 for ($col_min .. $col_max); for my $row ( $row_min .. $row_max ) { my @value = (); my $fmt; # format for complete row # store row in array for my $col ( $col_min .. $col_max ) { my $cell = $worksheet->get_cell( $row, $col ); if (defined $cell) { if ( $cell =~ /DFT Tasks for Ration/ .. /END OF RATION/) { $value[$col] = $cell->value() ; if ($value[$col] =~ m/Completed/i){ $fmt = $format[1] } # determine max width for each column my $len = length($value[$col]); $width[$col] = $len if $len > $width[$col]; } } } # write row for my $col ( $col_min .. $col_max ) { $worksheet1->write($row, $col, $value[$col],$fmt); } } # set column widths for my $col ( $col_min .. $col_max ) { $worksheet1->set_column($col,$col,$width[$col]) } }
        Please provide your valuable inputs

        Thanks Kshitij

      Your script appears to be working correctly for me. Try this test program

      #!/usr/bin/perl use strict; use warnings; use Text::Wrap; use Text::CSV_XS; my $csv = Text::CSV_XS->new ( { binary => 1, eol => "\r\n", auto_diag => 1 }); open my $fh, ">", "wraptest.csv" or die "$!"; my @data = ( ['1a this is more than 20 characters', '', '1c and another one that is more than 20 characters'], ['2a this is more than 20 characters', '', '2c and another one that is more than 20 characters'], ); for my $row_index (0..1){ my @row; for my $col_index (0..2){ my $source_cell = $data[$row_index][$col_index]; my $value; if ($source_cell){ local $Text::Wrap::columns = 20; $value = wrap ("", "", $source_cell); } push @row , $value; } $csv->print ($fh, \@row); } close $fh;
      poj
        Hi Poj ,

        Thanks for providing this code . I have tested it , it seems I was looking for the pre-formatted xls content into CSV but I think that is not possible. Now I am splitting my consolidated XLS into multiple smaller XLS based on the pattern match and would like to color the xls based on some keywords in the smaller XLS.

        I am facing one issue while doing this , I have already raised this issue( not able to print the range of lines from big XLS ) in response to Tux reply. </p?

        Thanks

        Kshitij