Thanks for your hint; unfortunately, my interpretation of it wasn't successful.
I apologize; I'm new to this editor, so the code was "lost".
#!/usr/bin/perl
use strict;
use Pod::Usage;
use Getopt::Std;
use Win32::OLE;
use Spreadsheet::WriteExcel;
use encoding "cp1250";
use Encode;
my $Excel;
my $Book;
my $Sheet;
my @Test;
my $Test;
@Test=encode("cp1250",@Test);
#Excel file FileToRead.xls contains the following Slovenian characters:
# Line 1 Column 1: Čč (capital and small letter C with caron)
# Line 1 Column 2: Šš (capital and small letter S with caron)
# Line 1 Column 3: Žž (capital and small letter Z with caron)
$Excel=Win32::OLE->GetActiveObject('Excel.Application') || Win32::OLE->new('Excel.Application', 'Quit');
$Book=$Excel->Workbooks->Open('c:\batch\Pisarna\Clouseau\FileToRead.xls');
$Sheet=$Book->Worksheets(1);
# characters read into @Test are correct
push(@Test,{
Col1=>$Sheet->Cells(1,1)->{'Value'},
Col2=>$Sheet->Cells(1,2)->{'Value'},
Col3=>$Sheet->Cells(1,3)->{'Value'},
});
$Book->Close;
$Excel->Quit;
my $BookOut=Spreadsheet::WriteExcel->new('c:\batch\Pisarna\Clouseau\FileWritten.xls');
my $SheetOut=$BookOut->add_worksheet('test');
# writing Slovenian characters directly
$SheetOut->write(0,0,"Čč");
$SheetOut->write(0,1,"Šš");
$SheetOut->write(0,2,"Žž");
# read from xls file through @Test
$SheetOut->write(1,0,"$Test1->{Col1}");
$SheetOut->write(1,1,$Test1->{Col2});
$SheetOut->write(1,2,$Test1->{Col3});
# strings defined in above lines 41, 42 and 43 are written into FileWritten.xls correctly:
# Line 1 Column 1: Čč (capital and small letter C with caron)
# Line 1 Column 2: Šš (capital and small letter S with caron)
# Line 1 Column 3: Žž (capital and small letter Z with caron)
# strings read into @Test are written into FileWritten.xls incorrectly:
# Line 2 Column 1: capital and small letter C with grave (can't be written)
# Line 2 Column 2: small outlined square (can't be written)
# Line 2 Column 3: small outlined square (can't be written)
__END__
In short, Slovenian characters written directly are correct, but those read from the xls file and written through @Test are not. Something weird happens to them on their way from the array through Spreadsheet::WriteExcel.
(I work on Windows XP Professional, version 2002, and I use Perl 5.8.)
Thanks for your hint.
...
my @Test;
my $Test;
@Test=encode("cp1250",@Test);
...
So you must have misunderstood what I meant. Anyway, since your script includes use encoding "cp1250"; I gather that you wrote your script with a text editor that saves the file in that encoding. That should be fine, but it means that the quoted strings with accented characters are treated internally in Perl as utf8 strings (that is what use encoding is supposed to do; see the output of perldoc encoding).
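To make the byte-versus-character distinction concrete, here is a minimal sketch using only the core Encode module. The byte values are the standard Windows-1250 codes for "Čč"; the variable names are just for illustration and are not taken from the script above.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Raw cp1250 bytes for "Čč" (0xC8 = Č, 0xE8 = č in Windows-1250).
my $bytes = "\xC8\xE8";

# Decoding turns the bytes into Perl's internal character string.
my $chars = decode('cp1250', $bytes);

# Same length (2), but the contents are now Unicode code points.
printf "bytes: %d, characters: %d\n", length($bytes), length($chars);
printf "code points: %s\n",
    join ' ', map { sprintf 'U+%04X', ord($_) } split //, $chars;
# prints "code points: U+010C U+010D"
```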
So if you want these characters to be stored in the Excel file as cp1250 characters, I think you need to do your "write" calls like this:
use Encode;
...
$SheetOut->write(0,0, encode( "cp1250", "Čč" ));
$SheetOut->write(0,1, encode( "cp1250", "Šš" ));
$SheetOut->write(0,2, encode( "cp1250", "Žž" ));
...
What happens in that case is: (1) your text editor saves the script as a cp1250-encoded text file; (2) when perl.exe reads the script to execute it, it sees use encoding "cp1250" and converts the special characters to its normal internal utf8 encoding (so that "character semantics" will work in the normal way); (3) when those write() functions are called, the Encode::encode function turns the utf8 strings back into cp1250 for storage in the Excel file.
At least, I think that's what should happen. Give it a try.
(update: the snippet I posted above is showing numeric character entities for some of the characters -- that was not intentional, but I'm not going to try to fix it -- you know which characters are supposed to be there.)
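Steps (2) and (3) can be checked in isolation, without Excel or any pragma, with a small round trip. This is only a sketch: it hard-codes the cp1250 bytes for "Šš" as a stand-in for what the editor would save, and it also prints the UTF-8 form so the two output encodings can be compared.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

my $source_bytes = "\x8A\x9A";                       # "Šš" as cp1250 bytes
my $chars        = decode('cp1250', $source_bytes);  # step 2: bytes -> characters
my $cp1250       = encode('cp1250', $chars);         # step 3: characters -> cp1250
my $utf8         = encode('UTF-8',  $chars);         # for comparison only

# Hex dump of each encoded form.
printf "cp1250: %s\n", join ' ', map { sprintf '%02X', ord($_) } split //, $cp1250;
printf "UTF-8:  %s\n", join ' ', map { sprintf '%02X', ord($_) } split //, $utf8;
# prints "cp1250: 8A 9A" and "UTF-8:  C5 A0 C5 A1"
```

If the cp1250 line does not match the original bytes, the string was probably not a proper character string before encode() was called.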
Thanks for the "code" suggestion.
I forgot to delete the contents of line 16 in the script before posting it. I apologize; without replacing [0] with 1 it wouldn't even write lines read from the array.
My text editor is set to "Central European (cp1250)": when I open the script, for example in Notepad++, MS Word or MS Excel, the Slovenian characters are displayed correctly.
I tried what you suggested:
$SheetOut->write(0,0,"ČčŠšŽž");
$SheetOut->write(1,0,encode("cp1250",$Test[0]->{Col1}));
In the xls file the first line is written correctly, the second one is not.
Why are characters defined as a string (line 1) correct and those read from the array (line 2) not correct?
Are there two or more different possible # characters in #!/usr/bin/perl and by Murphy I use the wrong one or something? Just kidding. :/
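One way to narrow this down is to print what each string actually contains before it reaches write(). The helper below is hypothetical (describe_string is not part of any module), and the two sample values are stand-ins: a character string such as a decoded literal, and raw cp1250 bytes such as a cell value might contain. It is a diagnostic sketch, not a fix.

```perl
use strict;
use warnings;

# Hypothetical helper: report the UTF-8 flag, length and code points of a scalar.
sub describe_string {
    my ($label, $value) = @_;
    my $points = join ' ', map { sprintf 'U+%04X', ord($_) } split //, $value;
    return sprintf '%s: flag=%s, length=%d, [%s]',
        $label, (utf8::is_utf8($value) ? 'on' : 'off'), length($value), $points;
}

# Stand-in values (assumptions, not data read from the real workbook).
my $literal   = "\x{10C}\x{10D}";  # "Čč" as a character string
my $from_cell = "\xC8\xE8";        # "Čč" as raw cp1250 bytes

print describe_string('literal', $literal),   "\n";
print describe_string('cell',    $from_cell), "\n";
```

In the real script the second call would take $Test[0]->{Col1}. If the two reports differ, the strings are not in the same form, and calling encode() on one that is already bytes would explain the wrong characters.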