Can you show us a bit more of the code that calls LibXML? I can't tell whether you are using a streaming parser or a DOM parser. It's obvious where the bottleneck is, but I think seeing more of the code would help diagnose the problem. If anything is proprietary and you can't show it to us, just leave those parts out.