Rufnex has asked for the wisdom of the Perl Monks concerning the following question:

Hiho,

I have a problem with XML::Simple. After parsing XML I get back the wrong characters for non-ASCII letters. It seems the UTF-8 is not being converted to Latin-1.

e.g.

MySQL fÃ¼r Dummies
-------^^

These two characters should be a single ü

I tried to use a code snippet like this (found here on perlmonks):

    sub encode {
        # UTF-8 to latin1 regex from XML::TiePYX (thanks to mirod)
        my ($text) = @_;
        $text =~ s{([\xc0-\xc3])(.)}{
            my $hi = ord($1);
            my $lo = ord($2);
            chr((($hi & 0x03) << 6) | ($lo & 0x3F))
        }ge;
        return $text;
    }
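To make the substitution easier to follow, here is the same bit arithmetic applied to the two UTF-8 bytes of ü (0xC3 0xBC). This is only a sketch of the calculation; the variable names are illustrative and not taken from XML::TiePYX.

```perl
use strict;
use warnings;

# UTF-8 encodes "ü" (U+00FC) as the two bytes 0xC3 0xBC.
my ($hi, $lo) = (0xC3, 0xBC);

# Keep the low 2 bits of the lead byte and the low 6 bits of the
# continuation byte, then join them into one code point.
my $code = (($hi & 0x03) << 6) | ($lo & 0x3F);

printf "0x%02X\n", $code;   # prints 0xFC, the Latin-1 code for ü
```

Because only two bits of the lead byte are kept, this approach can only produce code points up to 0xFF, which is why the pattern is limited to lead bytes 0xC0 to 0xC3.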

This works on my local machine (Windows XP) but not on the production server, which runs Linux.

Do you have any ideas or other solutions for this problem?

Thanks a lot, Rufnex

Replies are listed 'Best First'.
Re: XML::Simple and UTF8
by fglock (Vicar) on Jul 03, 2003 at 14:17 UTC

    use Encode

    For example, to convert a string from Perl's internal format to iso-8859-1 (also known as Latin1), $octets = encode("iso-8859-1", $string);
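    A minimal sketch of that suggestion, assuming the parser hands back a decoded character string (as is typical under Perl 5.8). The sample title is built by hand here to stand in for a parsed value.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Stand-in for a value returned by the parser: the character string
# "MySQL für Dummies", built here by decoding raw UTF-8 octets.
my $string = decode('UTF-8', "MySQL f\xC3\xBCr Dummies");

# Characters -> Latin-1 octets; "ü" becomes the single byte 0xFC.
my $octets = encode('iso-8859-1', $string);

print length($octets), " bytes\n";
```

    Characters that have no Latin-1 equivalent are replaced with a substitution character by default, so this is only lossless for text that fits in iso-8859-1.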
Re: XML::Simple and UTF8
by grantm (Parson) on Jul 03, 2003 at 18:59 UTC

    If you're outputting the results as a web page, then why not just add the appropriate header and send utf8? Everyone will be using utf-8 eventually (it makes things much simpler) so why not go with the flow rather than fight the current :-)
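    As a rough sketch of that approach, assuming a plain CGI-style script: declare the charset in the header and encode the body as UTF-8. The helper name utf8_response is hypothetical.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: builds a minimal CGI-style response whose body
# is UTF-8 encoded and whose header says so.
sub utf8_response {
    my ($html) = @_;    # character string, e.g. from the parser
    my $header = "Content-Type: text/html; charset=utf-8\r\n\r\n";
    return $header . encode('UTF-8', $html);
}

binmode(STDOUT);    # write raw bytes; the body is already encoded
print utf8_response("<p>MySQL f\x{FC}r Dummies</p>\n");
```

    With the charset declared, the browser decodes the two bytes 0xC3 0xBC back into a single ü, so no conversion to Latin-1 is needed.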

Re: XML::Simple and UTF8
by bm (Hermit) on Jul 03, 2003 at 16:57 UTC

    Are you using the same version of Perl on both your Linux and Win32 machines?

    Version 5.8 of Perl has much more robust support for UTF8 encoded files. I would upgrade if you are using an earlier version.
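    A quick way to compare the two machines is to run the same version check on both. This sketch uses only the built-in $] variable; the message wording is illustrative.

```perl
use strict;
use warnings;

# $] holds the running interpreter's version as a number,
# e.g. 5.008 for Perl 5.8.0.
my $has_58 = $] >= 5.008 ? 1 : 0;

print "Perl $] ", ($has_58 ? "is" : "is not"), " at least 5.8\n";
```

    Running perl -v on each machine gives the same information in a more readable form.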