Thanks for the suggestion.
I created a small test spreadsheet with two entries:
Fundación
ФОРСУНОК
The Encoding method returns 1 (8-bit ASCII or single-byte UTF-16) for the Spanish text and 2 (UTF-16BE) for the Russian text.
I also modified the TextFmt routine in FmtDefault.pm to print the value of the parameter $sCode. It was undef for the Spanish text and UTF16-BE for the Russian text. So the routine just returns the Spanish text since $sCode is undef, but formats the Russian text (which gets mangled) as UTF16-BE.
sub TextFmt($$;$) {
    my($oThis, $sTxt, $sCode) = @_;
    if((! defined($sCode)) || ($sCode eq '_native_')) {
        print STDERR "$sTxt/sCode " . (defined($sCode) ? "is _native_" : "undefined") . " - returning text\n";
        return $sTxt;
    };
    # Handle utf8 strings in newer perls.
    if ($] >= 5.008) {
        require Encode;
        print STDERR "$sTxt/$sCode; returning text with UTF-16BE encoding\n";
        return Encode::decode("UTF-16BE", $sTxt);
    }
    print STDERR "$sTxt/$sCode; formatting with pack/unpack\n";
    return pack('U*', unpack('n*', $sTxt));
    #return pack('C*', unpack('n*', $sTxt));
}
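For comparison, here is a minimal standalone sketch (not part of FmtDefault.pm) of what the two UTF-16BE branches above do with the same bytes. The sample string is hard-coded as an assumption rather than read from the spreadsheet, and the variable names are only illustrative. Both branches should yield the same Perl character string. If that string is then printed to a handle without an encoding layer, Perl warns about a "Wide character" and the terminal may show garbage, which is one possible, unconfirmed, reason the debug output looks mangled.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Assumption: the Russian test word, built from code points and then
# encoded as UTF-16BE bytes, standing in for what the parser hands to TextFmt.
my $chars = "\x{0424}\x{041E}\x{0420}\x{0421}\x{0423}\x{041D}\x{041E}\x{041A}";
my $bytes = encode('UTF-16BE', $chars);

# Branch for perl >= 5.8: decode the bytes into a character string.
my $decoded = decode('UTF-16BE', $bytes);

# Fallback branch: read 16-bit big-endian values, repack as code points.
my $unpacked = pack('U*', unpack('n*', $bytes));

# Give STDOUT an encoding layer so the characters are written as UTF-8
# instead of triggering a "Wide character" warning.
binmode(STDOUT, ':encoding(UTF-8)');

printf "%d bytes -> %d characters\n", length($bytes), length($decoded);  # 16 bytes -> 8 characters
print "decode and pack/unpack agree\n" if $decoded eq $unpacked;
print "$decoded\n";
```

Adding a similar binmode call for STDERR before the debug prints would show whether the text is really damaged or only displayed incorrectly.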