in reply to Re^2: XML::Parser problems
in thread XML::Parser problems
Sorry to reply with an RTFM, but this is what the FM reads (emphasis added):
Char (Expat, String)
This event is generated when non-markup is recognized. The non-markup sequence of characters is in String. A single non-markup sequence of characters may generate multiple calls to this handler. Whatever the encoding of the string in the original document, this is given to the handler in UTF-8.
Note that AFAIK all XML parsers behave like this, so that you can parse documents even if they contain chunks of text that are bigger than the available memory.
The XML::Parser review also mentions this, and gives you a way to get all the data.
Update: the Perl XML FAQ also mentions this.
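
For what it's worth, here is a minimal sketch of the usual workaround: append every Char chunk to a buffer and only use the buffer in the End handler. The element name `title`, the sub name `extract_titles` and the sample document are made up for illustration, and nested `title` elements are not handled.

```perl
use strict;
use warnings;
use XML::Parser;

# Return the complete text of every <title> element (hypothetical name),
# even when the parser delivers that text in several Char events.
sub extract_titles {
    my ($xml) = @_;
    my @titles;
    my $buffer;    # undef while we are outside a <title> element

    my $parser = XML::Parser->new(
        Handlers => {
            Start => sub {
                my ($expat, $element) = @_;
                # Start a fresh buffer when the element opens.
                $buffer = '' if $element eq 'title';
            },
            Char => sub {
                my ($expat, $string) = @_;
                # May be called several times for one run of text: append.
                $buffer .= $string if defined $buffer;
            },
            End => sub {
                my ($expat, $element) = @_;
                # The text is only known to be complete here.
                if ($element eq 'title' && defined $buffer) {
                    push @titles, $buffer;
                    undef $buffer;
                }
            },
        },
    );

    $parser->parse($xml);
    return @titles;
}

# Text containing an entity reference is a typical case where the
# Char handler can be called more than once for a single element.
my @titles = extract_titles(
    '<doc><title>Tom &amp; Jerry</title><title>Second</title></doc>'
);
print "$_\n" for @titles;
```

The point is that a single Char call never has to contain the whole text; the End handler is the first place where you can rely on having all of it.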
Re^4: XML::Parser problems
by Hena (Friar) on Jul 01, 2005 at 08:21 UTC