Thanks to all for your help; the different approach was what I needed. I can't change IBM's product, so I took a slightly different angle: I simply replaced encoding="UTF-8" with encoding="Latin1", and that allowed LibXML to parse the doc correctly.
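
For anyone who finds this later, here is a minimal sketch of that substitution. It is written in Python with the standard-library parser rather than LibXML, so treat it as an illustration of the idea and not the exact code used. It assumes the encoding attribute sits in the XML declaration at the very start of the document and that the payload bytes really are Latin-1. The function name relabel_encoding and the sample document are hypothetical, and ISO-8859-1 is used here as the canonical name for Latin1.

```python
import re
import xml.etree.ElementTree as ET


def relabel_encoding(raw: bytes, new_encoding: str = "ISO-8859-1") -> bytes:
    """Rewrite only the encoding value in the leading XML declaration.

    The document body is left untouched; only the label that tells the
    parser how to decode the bytes is changed.
    """
    # Match: <?xml ... encoding= , an opening quote, the old value, the same quote.
    pattern = rb'^(<\?xml[^>]*?encoding=)(["\'])[^"\']*\2'

    def swap(match):
        quote = match.group(2)
        return match.group(1) + quote + new_encoding.encode("ascii") + quote

    return re.sub(pattern, swap, raw, count=1)


# Hypothetical sample: declared as UTF-8, but 0xE9 is a Latin-1 "e acute",
# which is not a valid UTF-8 sequence on its own.
mislabelled = b'<?xml version="1.0" encoding="UTF-8"?><name>Caf\xe9</name>'

fixed = relabel_encoding(mislabelled)
root = ET.fromstring(fixed)  # the parser now decodes the bytes as Latin-1
```

This only works when the producer consistently emits Latin-1 bytes under a UTF-8 label. If the feed is sometimes genuine UTF-8, relabelling it will silently turn multi-byte characters into mojibake, so it is worth checking a few documents first.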