in reply to Re^2: Problems with XML encoding
in thread Problem with quotes, speciao characters and so on, reading a xml file

Hi

Hmm, in that case I think I misunderstood your problem, though I still think you should use some XML technology ;-) If you are only doing simple substitutions, could you do them using XSLT?

However, perhaps your problem is not with XML representations but with reading Unicode in. Assuming you're using Perl v5.8-v5.10, how are you opening the file? You need to tell Perl the encoding - presumably UTF-8.

You can do this in a number of ways:
    # use binmode on the filehandle
    open my $fh, '<', "file" or die "... $!";
    binmode $fh, ':utf8';

    # open $fh for reading UTF-8
    open(my $fh, "<:encoding(UTF-8)", "file") or die "... $!";

    # Use the open pragma to open all input files as UTF-8
    # see http://perldoc.perl.org/open.html
    use open IN => ':utf8';

    # or you can manually decode each data item
    # (decode_utf8 comes from the Encode module)
    use Encode;
    $str = decode_utf8( $str );

In your case, it is probably easiest to use binmode on the filehandle, at least to find out whether this is the problem.
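If you want to see what difference the decoding layer makes before touching your real script, something like the following might help. It is only a sketch: the temporary file and the sample word "café" are made up for illustration, and read_line is a hypothetical helper, not anything from your code. It reads the same bytes once with no layer and once with :encoding(UTF-8), then prints the string lengths.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a small sample file holding the UTF-8 bytes for "café"
# (made-up test data, stands in for your real XML file).
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;                    # write raw bytes, no layers
print {$out} "caf\xC3\xA9\n";    # e-acute is the two bytes C3 A9 in UTF-8
close $out or die "close failed: $!";

# Hypothetical helper: read the first line of $path using the given
# PerlIO layer. An empty string means "no decoding layer".
sub read_line {
    my ($layer) = @_;
    open my $fh, "<$layer", $path or die "cannot open $path: $!";
    my $line = <$fh>;
    close $fh;
    chomp $line;
    return $line;
}

my $raw     = read_line('');                    # bytes, undecoded
my $decoded = read_line(':encoding(UTF-8)');    # characters

printf "raw:     length %d\n", length $raw;
printf "decoded: length %d\n", length $decoded;
```

With default settings I would expect the raw read to report one more than the decoded read, because the accented letter arrives as two separate bytes when Perl has not been told the encoding. If your own data shows the same kind of difference, the encoding of the filehandle is very likely your problem.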

There are many documents trying to explain Unicode in Perl. I quite like this one. Be aware that Unicode support and the surrounding issues have changed quite a lot between Perl versions. v5.6 is completely different to the above, for example.

FalseVinylShrub

Disclaimer: Please review and test code, and use at your own risk... If I answer a question, I would like to hear if and how you solved your problem.