in reply to Isn't there a NOPARSE option for XML::Parser ?

Is it stupid of me to expect HTML in my XML?

I suspect you wouldn't like my first response to that question :-) so I'll move rapidly on to the second...

The sort of thing you describe would not be XML - end of story. This does not mean that it would have no value, simply that it would not be well-formed XML and therefore you could not use the myriad XML tools (e.g. XML::Parser) to work with it.

On a more helpful note, you can embed a chunk of HTML in XML as a CDATA section like this:

    <doc>
      <title>This is a test</title>
      <htmlstring><![CDATA[
        <p>Here is some HTML<br>
        It has an IMG tag: <img src="logo.png">
        and a BR tag<br>
        </p>
      ]]></htmlstring>
    </doc>
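To show what the CDATA wrapper buys you, here is a rough sketch that feeds the document above to an ordinary XML parser. I've used Python's standard xml.etree.ElementTree rather than XML::Parser purely so the snippet runs without installing anything; I'd expect any conforming XML parser to behave the same way, but treat that as an assumption. The variable names and the second "bare" document are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The document from above: the HTML is wrapped in a CDATA section.
doc = """<doc>
  <title>This is a test</title>
  <htmlstring><![CDATA[
    <p>Here is some HTML<br>
    It has an IMG tag: <img src="logo.png">
    and a BR tag<br>
    </p>
  ]]></htmlstring>
</doc>"""

root = ET.fromstring(doc)
title = root.findtext("title")
html = root.findtext("htmlstring")  # CDATA content comes back as plain text

print(title)
print(html.strip())

# Hypothetical counter-example: the same kind of HTML without CDATA.
# The unclosed <br> makes the document ill-formed, so parsing fails.
bare = "<doc><htmlstring><p>Line one<br></p></htmlstring></doc>"
try:
    ET.fromstring(bare)
    bare_ok = True
except ET.ParseError:
    bare_ok = False

print("bare HTML parsed as XML:", bare_ok)
```

The parser never looks inside the CDATA section, so the unclosed BR and IMG tags come through as an ordinary string that you can then hand to an HTML-aware tool.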

Edit: Oh, and I also meant to mention that the XML::LibXML module can parse HTML, so you might find that you can use it to work with your hybrid documents in an XMLish way.