artist has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks
I have disucssed already in CB and was suggested to put it here. I have the simple code here to read an excel file and write the content into another excel file. Here for the sake of simplicity I am doing without any modification to illustrate the point.

In my input I have few cells have data seperated by a new line ( That's the way it looks). When I open the output file in Excel it looks like that it replaces that data with a single line with subtituting a square shaped character where there was new line in the input. When I 'click' on that with the mouse, it looks exactly same as in input.

If necessary, the data can be manipulted in the program to have it automatically display without clicking.

How I can achieve that..?

The code is

use strict; use warnings; use Spreadsheet::ParseExcel::Simple; use Spreadsheet::WriteExcel::Simple; #IN my $file = 'infile.xls'; my $xls = Spreadsheet::ParseExcel::Simple->read($file); #OUT my $ss = Spreadsheet::WriteExcel::Simple->new(); my $count; foreach my $sheet ($xls->sheets) { while ($sheet->has_data) { #Reading my @idata = $sheet->next_row; @odata = @idata; # Here I can manipulate the data #Writing $ss->write_row(\@odata); last if $count++ > 5; } } print $ss->data;
Thanks.
artist

2003-04-29 edit ybiC: retitle from "Spradsheet Parsing problem"

Replies are listed 'Best First'.
Re: Spradsheet Parsing problem
by bart (Canon) on Apr 29, 2003 at 19:54 UTC
    As we already established on the Chatterbox... the square is the appropriate newline character: "\x0A". The problem is that Excel doesn't treat it that way... unless you add a format to those cells, using addformat from the module Spreadsheet::WriteExcel, and call the method set_text_wrap() on it.

    Now I've had to cheat a little, and check the source for Spreadsheet::WriteExcel::Simple, in order to figure out what would need to be added to your code. Note that though I did test it with Spreadsheet::WriteExcel, I couldn't actually test your final code... I can just hope it works.

    First, to create the format, add this somewhere under your new() call:

    my $fmt = $ss->book->addformat; $fmt->set_text_wrap;
    And to actually add the row, change the call to:
    $ss->write_row(\@odata, $fmt);
    i.e. add $fmt as a second parameter.
Re: Spreadsheet Parsing problem
by dragonchild (Archbishop) on Apr 29, 2003 at 18:02 UTC
    <guess>

    If the spreadsheets are on a Win32 platform and the Perl is on a Unix platform, the newline character ($/) might be getting munged. On Unix, it's \n and on Win32, it's \r\n (or \n\r?). Might want to check into that.

    </guess>

    ------
    We are the carpenters and bricklayers of the Information Age.

    Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement.

    Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.

      I added a bindmode STDOUT
      use strict; use warnings; use Spreadsheet::ParseExcel::Simple; use Spreadsheet::WriteExcel::Simple; binmode STDOUT; #IN my $file = 'infile.xls'; my $xls = Spreadsheet::ParseExcel::Simple->read($file); #OUT my $ss = Spreadsheet::WriteExcel::Simple->new(); my $count; foreach my $sheet ($xls->sheets) { while ($sheet->has_data) { #Reading my @idata = $sheet->next_row; my @odata = @idata; # Here I can manipulate the data #Writing $ss->write_row(\@odata); last if $count++ > 5; } } print $ss->data;
      and ran
      perl temp.pl >outfile.xls
      and oufile.xls opened fine. I tried a several different versions of infile.xls with blank rows and such but it seemed to work fine.

      --

      flounder

      They both are on windows NT 4.0 platform.
      version info:
      perl:  v5.6.0 (Activestate) build 623
      Spreadsheet-ParseExcel-Simple 1.01 
      Spreadsheet-WriteExcel-Simple 0.03
      
      artist

      If you have to start talking about the distinctions between bytes, don't refer to the semantic names "\r" and "\n". Instead, start discussing them with hex or octal literal values and specify the binary mode.

      On Unix, ("\n" eq "\x0A"). On Windows, perl will memorize it the same way, except treat any ("\n" eq "\x0D\x0A") on input or output, doing conversions if the stream is not binmode().

      --
      [ e d @ h a l l e y . c c ]

Re: Spreadsheet Parsing problem
by jmcnamara (Monsignor) on Apr 29, 2003 at 21:41 UTC

    Excel wraps text in a cell using the newline character "\n" and a format. The part that is missing from your code is the application of the format.

    To fix this you need to access the underlying Spreadhsheet::WriteExcel workbook object and add a "text wrap" format as bart has shown above. There is some additional information here.

    Another solution would be to use the Spreadhsheet::ParseExcel::SaveParser module which is a wrapper for Spreadhsheet::ParseExcel and Spreadhsheet::WriteExcel and which preserves cell formats when reading and writing a file.

    --
    John.

Re: Spradsheet Parsing problem
by Jenda (Abbot) on Apr 29, 2003 at 21:47 UTC

    It seems this is something worse than we all expected. I tried to modify the code slightly and replaced the last line with

    open OUT, '>', 'c:\temp\xls\Out.xls' or die "Can't create the file: $! +\n"; binmode OUT; print OUT $ss->data; close OUT;
    but still the resulting XLS doesn't show right.

    If I print the @idata in hexa there IS only LF, and if I look with a hexa editor on both the original XLS and the created one BOTH contain just LF between the lines yet ... one works and the other does not.

    This starts to look like a bug in Spreadsheet::WriteExcel::Simple or Spreadsheet::WriteExcel to me :-(

    Jenda
    Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live.
       -- Rick Osborne

    Edit by castaway: Closed small tag in signature


      This starts to look like a bug in Spreadsheet::WriteExcel::Simple or Spreadsheet::WriteExcel to me

      The following example might clarify what is happening:

      #!/usr/bin/perl -w use strict; use Spreadsheet::WriteExcel; my $workbook = Spreadsheet::WriteExcel->new("reload.xls"); my $worksheet = $workbook->addworksheet(); my $format = $workbook->addformat(text_wrap=> 1); my $str = "A\nB\nC"; $worksheet->write('B2', $str ); $worksheet->write('B4', $str, $format);

      This produces an Excel file that looks a little like this except the square described by artist is shown as an underline:

      ------------------------------------------ | | A | B | C | D | ... ------------------------------------------ | 1 | | | | | ... | 2 | | A_B_C | | | ... | 3 | | | | | ... | | | A | | | ... | | | B | | | ... | 4 | | C | | | ... | 5 | | | | | ... | 6 | | | | | ... |...| ... | ... | ... | ... | ...

      The difference is caused by the presence of the format (which is missing from artist's code). This can also be verified within Excel by toggling the Format->Cells->Alignment->Wrap Text option.

      As such, I don't think that this is a bug. :-)

      And just to clarify a point raised in some of the other nodes. The character required to generate a wrap is \n (0x0A) and not \r (0x0D). Here is a hexdump of the wrapped ABC string taken from a real Excel file:

      41 0a 42 0a 43
      --
      John.