When the first line is
    <?xml version="1.0" encoding="ISO-8859-1" ?>
everything works well
That's kind of like when the guy tells his doctor, "It only hurts when I do this...", to which the doctor replies, "Well, don't do that. (That'll be $50 for the visit.)"
Why assert that the xml file is utf8 when it's actually iso-8859-1? Is there a reason why you would want the xml file to really be utf8? Or maybe what you want is, after reading an iso-8859-1 xml file, to output something as utf8 data?
If you really want utf8 data in your xml, you might need to tell us more about how you are writing the xml file. If you just want to read the xml file as-is and output utf8 data, that's easy. After reading/parsing the xml file correctly, perl has the text stored internally (in memory) as utf8 strings.
(update: I'm not actually sure whether a non-utf8 xml file would automatically be converted to utf8 strings upon being parsed; you might need to explicitly "decode" the text in order to convert it to utf8; in that case, since you already know what the original (non-unicode) character set is, converting to utf8 is still really simple -- refer to the Encode module. Then, to output the data as utf8, ...)
Just set whatever output file handle to utf8 mode in order to print the text as utf8 data:
    binmode $output_file_handle, ":utf8";
(where the first arg to binmode could be STDOUT, or any similar file handle that you've opened for output).
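To make the two steps concrete, here is a minimal sketch that assumes the parsed text comes back as raw ISO-8859-1 bytes, which may not be true for your parser. If the parser already returns character strings, skip the decode step, since decoding a second time would mangle the text. The sample string and the in-memory file handle are made up for illustration; `$output_file_handle` could just as well be STDOUT or a handle opened on a real file.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical sample: the bytes of "cafe" with an e-acute, as stored in an
# ISO-8859-1 file, where e-acute is the single byte 0xE9.
my $raw_bytes = "caf\xE9";

# Turn the raw bytes into a Perl character string.
# (Assumption: needed only if the parser handed back undecoded bytes.)
my $text = decode('ISO-8859-1', $raw_bytes);

# Write to an in-memory file handle so the resulting bytes can be inspected.
my $buffer = '';
open my $output_file_handle, '>', \$buffer or die "Cannot open: $!";
binmode $output_file_handle, ':utf8';
print {$output_file_handle} $text;
close $output_file_handle;

# The e-acute is now the two-byte utf8 sequence 0xC3 0xA9.
printf "%d characters in, %d bytes out\n", length($text), length($buffer);
```

The character count and the byte count differ because the one non-ASCII character takes two bytes in utf8, which is a quick way to confirm the output layer is doing its job.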
In reply to Re: XML File Encoding and Parsing Problem
by graff
in thread XML File Encoding and Parsing Problem
by merrymonk