in reply to Re^4: Slovenian characters problem
in thread Slovenian characters problem

Based on your description, it seems to make sense that the first $SheetOut->write() works as intended, because your script has the literal string stored in the encoding that works for your usage of Excel.

Regarding the second call to $SheetOut->write(), there are two issues you need to consider (and describe for us, if you still need us to help):

1. In what way is the result not correct? What do you actually get in place of the data you wanted to get? Is it partially bad, or completely bad? Do you see one or more question marks? Do you see nonsense characters?

2. What sort of data is actually stored in that element of the "$Test" AoH, which you want to put into your spreadsheet? Where does it come from? What other steps in your code have had an effect on that item of data before you pass it to $SheetOut->write() ?

For the second issue, it might suffice just to print out the contents of $Test[0]->{Col1} in some explicit way -- probably the best bet would be Data::Dumper...

use Data::Dumper 'Dumper'; ... print Dumper( $Test[0] ); # display contents of anon.hash, including +{Col1} ...
Based on what that shows, it might be fairly simple to figure out the cause of the trouble.

Replies are listed 'Best First'.
Re^6: Slovenian characters problem
by JackVanRamars (Novice) on Aug 18, 2010 at 05:32 UTC

    The first call $SheetOut->write(0,0,"Č芚Žž") produces the correct writeout in Excel xls table.

    In my script I open a simple Excel table and read its line where the same "Č芚Žž" characters are written, then I write them into Excel xls table:

    push(@Test,{Col1=>$Sheet->Cells(1,1)->{'Value'}}) $SheetOut->write(1,0,encode("cp1250",$Test[0]->{Col1}));

    Add 1: As a result the script writes the following characters into output table: "Čč" become "بو", "Šš" and "Žž" become four outlined squares. The characters are therefore completely different, I should perform REPLACE manually in Excel (which is out of question: doing that writing script would have no sense). They become question marks only in case if I skip the use encoding "cp1250".

    Add 2: In debugger I see @Test[0]->{Col1} which contains the correct characters "Č芚Žž". Data is read from a simple xls Excel table. The next step is writing into another xls Excel table (using Spreadsheet::WriteExcel. The script does nothing to @Test before writing it into table.

    I was curious what happens if I add the following four lines:

    1 $String="Č芚Žž"; 2 $Test2[0]->{Col1}="Č芚Žž"; ... 3 $SheetOut->write(2,0,$String); 4 $SheetOut->write(3,0,$Test2[0]->{Col1});

    The result of all additional lines from 1 to 4 in the output Excel table are correct written characters!

    So it seems to me that the only problem occurs by reading Slovenian characters from Excel table (via Win32::OLE). Later I'll check the contents of @Test using Data::Dumper (btw thanks for the tip!) and describe the results.

      So it seems to me that the only problem occurs by reading Slovenian characters from Excel table (via Win32::OLE).

      Since you are using Spreadsheet::WriteExcel and this is working okay on literal strings, you might want to try using Spreadsheet::ParseExcel to read your input Excel files (instead of Win32::OLE). I don't know how different the two approaches are, since I've never used the OLE module, but I've used ParseExcel, and it seems to work fine (on unix and linux, at least). I've had good experiences with it when the excel data included unicode characters, and maybe that's related to what is happening in your case, but I'm not sure.

      Later I'll check the contents of @Test using Data::Dumper

      That is certainly the next thing to try -- the sooner, the better. (BTW, I'm glad you're not doing anonymous posts anymore. Welcome to the Monastery!)

        I've added the following line into script:

        use Data::Dumper;

        After I read the xls table contents here is the second one:

        print "\@Test values are: ", Dumper(\@Test);

        I think I'll have to take your advice and give up on Win32::OLE since Data::Dumper reports this:

        Test values are: $VAR1 = [ { 'Col1' => '\x{00c8}\x{00e8}\x{008a}\x{009a}\x{008e}\x{009e +}' } ]; "\x{00e8}" does not map to cp1250 at C:\batch\Pisarna\Clouseau\Testira +nje.pl line 31. "\x{008a}" does not map to cp1250 at C:\batch\Pisarna\Clouseau\Testira +nje.pl line 31. "\x{009a}" does not map to cp1250 at C:\batch\Pisarna\Clouseau\Testira +nje.pl line 31. "\x{008e}" does not map to cp1250 at C:\batch\Pisarna\Clouseau\Testira +nje.pl line 31. "\x{009e}" does not map to cp1250 at C:\batch\Pisarna\Clouseau\Testira +nje.pl line 31.

        I've started with Win32::OLE and then found that I need to add Spreadsheet::WriteExcel as well because it controls Excel colors and other attributes. I hoped I'll use Tk module - but there are about 150 columns of (wide string) data so Tk is not an option (it's very useful for less data though).

        I will deinitely skip Win32::OLE and check how successfull am I with Spreadsheet::ParseExcel. I'll give notice when I make changes - in case some other poor fellow hits the same wall.

        Thanks for your help!

        P.S.: I forgot to log in earlier...