That is correct: the source XML is broken because it contains invalid characters, and it cannot even be rendered in Firefox.
I would post the error message, but it is not really relevant to what I am trying to solve, and it would take me a while to find the code and run it again. In the meantime, I have switched to HTML::TokeParser, as it seems to be less strict about the difference between well-formed XML and a bunch of text with tags in it.
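For anyone following along, here is a rough sketch of the kind of thing I mean. It is not my actual code: the sample string, the `item` tag and the `id` attribute are all made up for illustration. It only assumes that HTML::TokeParser is installed.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Made-up sample: a raw control character (\x01) and an unclosed <item>,
# either of which would normally make a strict XML parser give up.
my $broken = "<doc><item id=\"1\">first\x01 entry<item id=\"2\">second entry</item></doc>";

# Passing a scalar reference makes the parser read from the string itself.
my $parser = HTML::TokeParser->new(\$broken)
    or die "Could not create parser";

# get_tag("item") skips ahead to the next <item> start tag and returns
# [$tag, $attr, $attrseq, $text], or undef once the input is exhausted.
while (my $tag = $parser->get_tag("item")) {
    my $id   = $tag->[1]{id};
    my $text = $parser->get_trimmed_text;   # text up to the next tag
    print "$id: $text\n";
}
```

The tokenizer does not check well-formedness at all, so the invalid character is simply passed through as text and would still need to be cleaned up separately.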
Thank you for your reply. I will take on board the suggestion to look up different encodings and see if one works.