chafelix has asked for the wisdom of the Perl Monks concerning the following question:

I am struggling with non-ASCII characters. Specifically, I must read a text file containing non-ASCII characters (local alphabet). I can do that, and they are stored in a data structure, e.g.

$myhash{$ascii_variable} = [@array_of_non_ascii];

All manipulations are done with the ASCII key. The problem is that when I write the processed output to xlsx, the non-ASCII characters do not come out right. Again, I am not 'doing anything' with them, just reading and writing them.

Usually, when I had such issues, there were just a couple of such characters, which I copied manually and pasted into an Excel file. Then, when I read the Excel file and printed it, everything was fine. But this time there are too many of them.

So how should I deal with this?

Replies are listed 'Best First'.
Re: non-ASCII characters in text->Excel
by haukex (Archbishop) on Nov 18, 2019 at 11:12 UTC

    You'll have to tell us a little more about the problem - best would be a Short, Self-Contained, Correct Example that reproduces the issue, so that we can see the problem for ourselves, including what module(s) you're using, how you're handling the string, how you've verified that the encodings are correct, and so on.

      The quick and dirty way to solve this is the following:
      - Create an Excel file, say 'Characters.xlsx'.
      - In the first column, put each of the characters that give you problems on its own row.
      - Read this file with Perl and store each character in a variable.
      - Whenever your code needs one of these characters, use the variables you stored when reading this Excel file.

        use strict;
        use warnings;
        use Spreadsheet::ParseXLSX;

        my $infile           = 'Characters.xlsx';
        my @funny_characters = ();
        my $parser           = Spreadsheet::ParseXLSX->new();
        my $workbook         = $parser->parse($infile);
        for my $worksheet ( $workbook->worksheets() ) {
            my ( $row_min, $row_max ) = $worksheet->row_range();
            for my $row ( 0 .. $row_max ) {
                # assuming all funny characters are in the first column
                my $cell = $worksheet->get_cell( $row, 0 );
                my $val  = $cell ? $cell->value() : '';
                push @funny_characters, $val;
            }
        }
        # At the end, $funny_characters[0] holds the first funny character,
        # $funny_characters[1] the second one in your Characters.xlsx file, and so on.

      If you want to do something with those funny characters, you say

        if ( $mytext =~ /$funny_characters[0]/ ) { ..... }
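
For comparison with the Characters.xlsx workaround above, here is a minimal sketch of the more usual approach: decode the text when reading and encode it again only when writing. It assumes the input file is UTF-8, which the question does not state. The sample bytes, the `ascii_key` hash key, and the in-memory filehandles standing in for the real files are all hypothetical, chosen so the snippet runs on its own with core Perl.

```perl
use strict;
use warnings;

# Hypothetical sample data: the three Greek letters alpha, beta, gamma as raw
# UTF-8 bytes, standing in for the contents of the real input file.
my $bytes = "\xCE\xB1\xCE\xB2\xCE\xB3";

# Read through an encoding layer so Perl sees characters, not bytes.
# (Assumption: the input is UTF-8. An in-memory handle replaces the file.)
open my $in, '<:encoding(UTF-8)', \$bytes or die "Cannot open input: $!";
my $line = <$in>;
close $in;

# Store the decoded string under an ASCII key, as in the question.
my %myhash = ( ascii_key => [$line] );

# A decoded string reports its length in characters, not bytes.
printf "bytes: %d, characters: %d\n", length($bytes), length($line);

# Write through an encoding layer; the text becomes bytes again only here.
my $written = '';
open my $out, '>:encoding(UTF-8)', \$written or die "Cannot open output: $!";
print {$out} $myhash{ascii_key}[0];
close $out;

print "round trip ok\n" if $written eq $bytes;
```

If the byte count and the character count differ as shown here, the string was decoded on input. Modules that write xlsx files generally expect such decoded character strings rather than raw bytes, but that is worth confirming in the documentation of whichever module is actually in use.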