in reply to XML parsing vs regex

It depends on the complexity and volatility of your XML.

I use regexes myself to parse pdftohtml -xml output and have never regretted it (it's a very simple and stable format).

We don't have the necessary background information to judge your case.

Anyway, be careful with your regex: at the very least, a non-greedy quantifier like .*? and some plausibility tests on the result would certainly help make your code more robust!
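
To illustrate what I mean, here is a minimal sketch (in Python, but the regex syntax is the same in Perl). The sample string is only loosely modeled on pdftohtml -xml output, and the names SAMPLE, GREEDY, NON_GREEDY and extract_texts are made up for this example. It shows how a greedy .* swallows everything between the first opening and the last closing tag, while .*? stops at the nearest one, and how a cheap plausibility test can catch input the regex can't handle.

```python
import re

# Hypothetical sample, loosely modeled on pdftohtml -xml output (an assumption,
# not copied from real output).
SAMPLE = (
    '<text top="10" left="20" width="30" height="12" font="0">Hello</text>'
    '<text top="30" left="20" width="35" height="12" font="0">World</text>'
)

# Greedy: .* runs to the LAST </text>, so both elements collapse into one match.
GREEDY = re.compile(r'<text\b[^>]*>(.*)</text>', re.S)

# Non-greedy: .*? stops at the FIRST </text> after each opening tag.
NON_GREEDY = re.compile(r'<text\b[^>]*>(.*?)</text>', re.S)


def extract_texts(xml):
    """Return the content of each <text> element, with basic plausibility tests."""
    texts = NON_GREEDY.findall(xml)

    # Plausibility test 1: one captured content per opening tag.
    opening_tags = len(re.findall(r'<text\b', xml))
    if len(texts) != opening_tags:
        raise ValueError(
            f'found {opening_tags} opening tags but {len(texts)} matches'
        )

    # Plausibility test 2: no captured content contains another opening tag.
    for content in texts:
        if re.search(r'<text\b', content):
            raise ValueError(f'suspicious nested tag in: {content!r}')

    return texts


if __name__ == '__main__':
    print(len(GREEDY.findall(SAMPLE)))  # a single over-long match
    print(extract_texts(SAMPLE))
```

If the format ever changes (nested elements, a missing closing tag), the checks raise an error instead of silently returning garbage, which is about the point where I would switch to a real XML parser.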

Cheers Rolf

(addicted to the Perl Programming Language)