in reply to Re^3: XML::DOM and Accented Characters
in thread XML::DOM and Accented Characters
Thanks for the help, but I'm still unable to get it to work even after adding the BOM, although I am learning along the way.
I'm now using both TextPad and NotePad++ (with plugin) to view the hex codes for the output file (accentTestOutput.xml). I've also run it on both my work and home PCs, both running Windows.
After running the code provided by almut I'm still not seeing C3 A9 as the hex code for the e-acute. TextPad is displaying E9 and NotePad++ is displaying EF BF BD. It also looks as if the BOM is not there: I am unable to see EF BB BF at the start of the file (which is what I should see, right?).
Using the UTF8BOM package to insert the BOM, I can see EF BB BF at the start of the file in both TextPad and NotePad++, so the BOM is there. However, both programs now display E9 as the code for the e-acute, not the C3 A9 I'm looking for.
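For reference, the byte sequences mentioned above can be reproduced with Perl's core Encode module. This is only a minimal sketch to show what each hex sequence corresponds to; it is not taken from almut's code, and the helper name hex_bytes is made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: render a byte string as space-separated hex.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02X', ord($_) } split //, $bytes;
}

my $e_acute = "\x{e9}";    # U+00E9, e-acute, as a Perl character string

print hex_bytes(encode('UTF-8',      $e_acute)),   "\n";   # C3 A9
print hex_bytes(encode('ISO-8859-1', $e_acute)),   "\n";   # E9
print hex_bytes(encode('UTF-8',      "\x{feff}")), "\n";   # EF BB BF (the BOM)
print hex_bytes(encode('UTF-8',      "\x{fffd}")), "\n";   # EF BF BD (replacement character)
```

So a lone E9 byte is the Latin-1 encoding of the e-acute, and EF BF BD is the UTF-8 encoding of U+FFFD, the replacement character. That would be consistent with NotePad++ reading the file as UTF-8 and substituting the invalid E9 byte for display, though I can't confirm that is what it does.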
Incidentally, at no point have I been able to open the output file in Internet Explorer. It complains of an invalid character at the point of the e-acute.
Here's the output after trying to insert the BOM using
print $fh "\x{feff}";
TextPad
0: 3C 3F 78 6D 6C 20 76 65 72 73 69 6F 6E 3D 22 31 <?xml version="1
10: 2E 30 22 20 65 6E 63 6F 64 69 6E 67 3D 22 55 54 .0" encoding="UT
20: 46 2D 38 22 3F 3E 0D 0A 3C 54 45 53 54 3E 20 E9 F-8"?>..<TEST> é
30: 20 3C 2F 54 45 53 54 3E 0D 0A  </TEST>..
NotePad++
3c 3f 78 6d 6c 20 76 65 72 73 69 6f 6e 3d 22 31 2e 30 22 20 65 6e 63 6f 64 69 6e 67 3d 22 55 54 46 2d 38 22 3f 3e 0d 0a 3c 54 45 53 54 3e 20 ef bf bd 20 3c 2f 54 45 53 54 3e 0d 0a
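One variation that may be worth trying is sketched below. It assumes (and I can't confirm this from the dumps alone) that the output handle is being opened without an encoding layer. Here the handle is opened with an explicit :encoding(UTF-8) layer, so both the BOM and the e-acute are encoded as they are written. The file name is hypothetical, and the XML is printed by hand rather than through XML::DOM to keep the sketch self-contained.

```perl
use strict;
use warnings;

my $file = 'accentTestSketch.xml';    # hypothetical file name

# :raw avoids CRLF translation on Windows; :encoding(UTF-8) encodes
# every character printed to the handle as UTF-8.
open my $fh, '>:raw:encoding(UTF-8)', $file
    or die "Cannot open $file for writing: $!";
print $fh "\x{feff}";                                     # BOM, U+FEFF
print $fh qq{<?xml version="1.0" encoding="UTF-8"?>\n};
print $fh "<TEST> \x{e9} </TEST>\n";                      # U+00E9, e-acute
close $fh or die "Cannot close $file: $!";

# Read the file back as raw bytes to check what actually landed on disk.
open my $in, '<:raw', $file or die "Cannot open $file for reading: $!";
my $bytes = do { local $/; <$in> };
close $in;

print join(' ', map { sprintf '%02X', ord($_) } split //, $bytes), "\n";
```

If the layer is doing its job, the dump should start with EF BB BF and show C3 A9 where the e-acute is.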
Here's the output after trying to insert the BOM with the UTF8BOM Perl package, using
UTF8BOM->insert_into_file('c:\\accentTestOutPut.xml');
You can see the BOM code at the beginning of the file.
TextPad
0: EF BB BF 3C 3F 78 6D 6C 20 76 65 72 73 69 6F 6E <?xml version
10: 3D 22 31 2E 30 22 20 65 6E 63 6F 64 69 6E 67 3D ="1.0" encoding=
20: 22 55 54 46 2D 38 22 3F 3E 0D 0A 3C 54 45 53 54 "UTF-8"?>..<TEST
30: 3E 20 E9 20 3C 2F 54 45 53 54 3E 0D 0A > é </TEST>..
NotePad++
ef bb bf 3c 3f 78 6d 6c 20 76 65 72 73 69 6f 6e 3d 22 31 2e 30 22 20 65 6e 63 6f 64 69 6e 67 3d 22 55 54 46 2d 38 22 3f 3e 0d 0a 3c 54 45 53 54 3e 20 e9 20 3c 2f 54 45 53 54 3e 0d 0a
I'm at the edge of what I know, so I don't really know where to go from here. I appreciate the help you've given. Any other ideas? If I've missed out some info that may be useful, let me know.