
Thank you very much for taking the trouble!

I had some problems with XML::Parser:

If it encounters unresolvable entities (which, I admit, is formally an error in the XML document), it calls the default handler regardless of which handlers you have installed. This makes things more difficult, but I could live with it (I had already changed my code accordingly).
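To make the situation concrete, here is a minimal sketch of what I mean. The sample document and the file name doc.dtd are made up for illustration, and the sketch assumes that the external DTD is not loaded (ParseParamEnt is left at its default), so the entity stays unresolved. The parse call is wrapped in eval in case your expat build rejects the document instead.

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical sample document: the external DTD is never loaded here,
# so &unknown; cannot be resolved.
my $xml = '<!DOCTYPE doc SYSTEM "doc.dtd"><doc>before &unknown; after</doc>';

my (@char_data, @default_strings);

my $parser = XML::Parser->new(
    Handlers => {
        # Receives ordinary character data.
        Char    => sub { my ($expat, $text)   = @_; push @char_data, $text },
        # Receives everything that has no registered handler.
        Default => sub { my ($expat, $string) = @_; push @default_strings, $string },
    },
);

# Guard the call: a parse error is reported rather than ending the script.
eval { $parser->parse($xml); 1 } or warn "parse failed: $@";

print "Char handler:    [$_]\n" for @char_data;
print "Default handler: [$_]\n" for @default_strings;
```

Note that the Default handler also receives any other markup that has no handler of its own (here, the doctype declaration), so the entity reference has to be picked out of that stream.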

The deal-breaker is this: in a handler, you get the original (unparsed) string by invoking the underlying expat instance via

$_[0] -> original_string

or

$_[0] -> recognized_string

That would be nice and easy in principle, but in some cases the respective string contains only rubbish; this is true for nearly all of the declaration blocks (for example, doctype declarations and attribute declarations). The expat documentation explicitly confirms this observation; unfortunately, it is something I can't live with.
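For anybody who wants to reproduce this, the following is a rough sketch rather than my actual code. The sample document and the helper name record are invented for illustration. It calls both methods from a Start handler, where the strings are usable, and from Doctype and Attlist handlers, where they are not reliable, and simply prints whatever comes back.

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical sample document with an internal DTD subset.
my $xml = <<'XML';
<!DOCTYPE root [
  <!ELEMENT root (item)*>
  <!ELEMENT item EMPTY>
  <!ATTLIST item id CDATA #IMPLIED>
]>
<root><item id="1"/></root>
XML

my @records;

# Hypothetical helper: store what both methods return inside a handler.
sub record {
    my ($expat, $handler, $name) = @_;
    push @records, {
        handler    => $handler,
        name       => $name,
        original   => $expat->original_string   // '',
        recognized => $expat->recognized_string // '',
    };
}

my $parser = XML::Parser->new(
    Handlers => {
        Start   => sub { record($_[0], 'Start',   $_[1]) },
        Doctype => sub { record($_[0], 'Doctype', $_[1]) },
        Attlist => sub { record($_[0], 'Attlist', $_[1]) },
    },
);

$parser->parse($xml);

for my $r (@records) {
    printf "%-8s %-5s original=[%s] recognized=[%s]\n",
        @{$r}{qw(handler name original recognized)};
}
```

In the Start records the strings are the start tags as they appear in the document; what shows up in the Doctype and Attlist records is exactly the part that cannot be relied upon.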

As far as I know, XML::Parser is always based on expat, but perhaps I have misunderstood something. If so, I would be grateful if somebody could show me how to use XML::Parser with a different underlying parser.

Thank you very much,

Nocturnus