Thanks, but yes, I really want to use XML::LibXML. It's so much faster than HTML::TreeBuilder and speed is critical in my application. So far it's actually been pretty reliable - this problem only occurs in around 1 out of every 100,000 or so pages I've parsed.
-sam