Hi, I've got 5 versions of the code, each incorporating suggestions made to me, none of which I can (yet) get to work on Windows. I tested the last version on a Unix machine and it worked OK. Opening that Unix-created XML on Windows also works OK.

#!/bin/perl -w
use XML::DOM;

my $parser = new XML::DOM::Parser;
my $doc = $parser->parsefile ("c:\\accentTest.xml", ProtocolEncoding => 'UTF-8');

# Print doc file
$doc->printToFile ("c:\\accentTestOutPut.xml");

#re-open file in UTF-8 encoded filehandle
open my $fh, ">:utf8", "accentTestOutPut.xml" or die $!;
$doc->print($fh);

# cleanup
$doc->dispose;

#!/bin/perl -w
use XML::DOM;

my $parser = new XML::DOM::Parser;
my $doc = $parser->parsefile ("c:\\accentTest.xml", ProtocolEncoding => 'UTF-8');

# Print doc file
$doc->printToFile ("c:\\accentTestOutPut.xml");

#re-open file in UTF-8 encoded filehandle
open my $fh, ">:utf8", "accentTestOutPut.xml" or die $!;
print $fh "\x{FEFF}"; # BOM
$doc->print($fh);

# cleanup
$doc->dispose;

#!/bin/perl -w
use XML::DOM;
use UTF8BOM;

my $parser = new XML::DOM::Parser;
my $doc = $parser->parsefile ("c:\\accentTest.xml", ProtocolEncoding => 'UTF-8');

# Print doc file
$doc->printToFile ("c:\\accentTestOutPut.xml");
UTF8BOM->insert_into_file('c:\\accentTestOutPut.xml');

# cleanup
$doc->dispose;

#!/bin/perl -w
use XML::DOM;
use Encode qw(encode_utf8);

my $parser = new XML::DOM::Parser;
my $doc = $parser->parsefile ("c:\\accentTest.xml", ProtocolEncoding => 'UTF-8');

# Print doc file
$doc->printToFile ("c:\\accentTestOutPut.xml");

open my $fh, ">:utf8", "accentTestOutPut.xml" or die $!;
encode_utf8($fh);
$doc->print($fh);

# cleanup
$doc->dispose;

#!/bin/perl -w
use XML::DOM;
use PerlIO::encoding;

my $parser = new XML::DOM::Parser;
my $doc = $parser->parsefile ("c:\\accentTest.xml", ProtocolEncoding => 'UTF-8');

# Print doc file
$doc->printToFile ("c:\\accentTestOutPut.xml");

open my $fh, ">:encoding(UTF-8)", "accentTestOutPut.xml" or die $!;
$doc->print($fh);

# cleanup
$doc->dispose;

What I have also discovered is that changing the first line of the XML to <?xml version="1.0" encoding="windows-1252"?> (as suggested by ikegami) lets me open the file OK on Windows in all cases.
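
For anyone following along, here is a minimal sketch of why the encoding named in the declaration matters. It only shows that the same accented character is stored as different bytes in UTF-8 and in windows-1252. It is not a diagnosis of my files: the choice of "é" as the sample character is an assumption for illustration, and I have not confirmed which bytes are actually in accentTestOutPut.xml.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode);

# Sample character chosen for illustration only: é (U+00E9).
my $char = "\x{E9}";

# Encode the same character in both encodings.
my $utf8   = encode('UTF-8',  $char);   # two bytes
my $cp1252 = encode('cp1252', $char);   # one byte

# Render a byte string as space-separated hex pairs.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02X', ord($_) } split //, $bytes;
}

print "UTF-8:        ", hex_bytes($utf8),   "\n";   # C3 A9
print "windows-1252: ", hex_bytes($cp1252), "\n";   # E9
```

If the bytes in the file are the single-byte form while the declaration says UTF-8, a strict XML reader would be expected to reject the file, which would at least be consistent with the windows-1252 declaration making it open.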


In reply to Re^8: XML:: DOM and Accented Characters by freeflyer
in thread XML:: DOM and Accented Characters by freeflyer
