in reply to Re^2: XML::SAX::PurePerl, handle entities
in thread XML::SAX::PurePerl, handle entities

That code doesn't run. Something about $parser not having a value. You'd think it would be an XML::SAX::PurePerl object, but it doesn't have a parse_string method. What are you talking about?!

And why are you trying to redefine < and >? Those are two of the five entities XML understands natively (lt, gt, amp, apos and quot). I've even shown that XML::SAX::PurePerl handles them.

Update: Must have run a bad test. It does indeed have parse_string.

Re^4: XML::SAX::PurePerl, handle entities
by VirtualRider (Initiate) on Aug 12, 2010 at 18:13 UTC

    parse_string is similar to parse_uri; if I remember correctly, both are defined by XML::SAX::Base. I don't want to redefine lt and gt. They were just part of a set of entities I need to replace/handle, and I removed the native XML entities from that set, so this isn't a problem.

    If there is no way to send unknown entities to a handler/method, I need to generate the doctype entity section at runtime and deal with them after parsing. I was just looking for a way to do this at parse time and avoid generating the doctype section.
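The runtime generation mentioned above could look roughly like the following sketch. It only builds the DOCTYPE text; build_doctype and the sample entity table are hypothetical names made up for illustration and are not part of XML::SAX. It assumes each value is already valid entity-literal content (for example a character reference) and skips the five predefined names so they are never redeclared.

```perl
use strict;
use warnings;

# The five entities every XML parser predefines; never redeclare these.
my %PREDEFINED = map { $_ => 1 } qw(lt gt amp apos quot);

# build_doctype() is a hypothetical helper, not an XML::SAX API.
# $root is the document element name; $entities maps entity names to
# replacement text that is assumed to be valid inside an entity literal.
sub build_doctype {
    my ($root, $entities) = @_;
    my @decls;
    for my $name (sort keys %$entities) {
        next if $PREDEFINED{$name};    # leave lt, gt, amp, apos, quot alone
        my $value = $entities->{$name};
        # Escape the characters that would end or corrupt the quoted literal.
        $value =~ s/%/&#37;/g;
        $value =~ s/"/&#34;/g;
        push @decls, qq{  <!ENTITY $name "$value">};
    }
    return join "\n", '<!DOCTYPE ' . $root . ' [', @decls, ']>';
}

# Sample data (assumed); lt is present only to show that it gets skipped.
my %entities = (
    nbsp => '&#160;',
    copy => '&#169;',
    lt   => '<',
);

print build_doctype('root', \%entities), "\n";
```

The returned string can be placed between the XML declaration and the root element before the document is handed to the parser.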

    The end-tag mismatch is a major problem and could have something to do with the size of the XML file (the tags are correct), but I won't be able to look at it until tomorrow.

    #! /usr/bin/perl
    use strict;
    use warnings;

    BEGIN {
        package MySAXHandler;
        use parent 'XML::SAX::Base';
        sub start_element { print "element $_[1]{Name}\n"; }
        sub end_element { print "element end\n"; }
        sub characters { print "text $_[1]{Data}\n"; }
    }

    use XML::SAX::PurePerl;

    my $parser = XML::SAX::PurePerl->new(
        Handler => MySAXHandler->new(),
    );

    $parser->parse_string("<?xml version=\"1.0\"?>
    <!DOCTYPE root [
    <!ENTITY lt \"<\">
    <!ENTITY gt \">\">
    ]>
    <root>&lt;SOMETHING&gt;</root>");

      Sorry about parse_string

      If there is no way to send unknown entities to a handler/method, I need to generate the doctype entity section at runtime

      There's no such thing as an unknown entity. Entities must be declared for the document to be valid. Usually, they are declared in an external DTD, but if you want to include the declaration inside the document, that's your prerogative.

      That said, parsers usually provide hooks to avoid the need to download and parse the DTD. I don't know if XML::SAX and XML::SAX::PurePerl offer any.

      Either way, lt and gt aren't unknown entities, and it's definitely not safe to change their meaning as you are doing. In fact, you are causing the very problem you asked about. As previously demonstrated in one post and previously mentioned in another, don't do that.
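A rough illustration of why the redefinition hurts. This uses plain string substitution, not a real parser, and naive_expand is a hypothetical helper written only for this example. It approximates what happens when an entity's replacement text is included in content: with lt and gt declared as raw < and >, the reference turns into markup, which would plausibly explain an end-tag mismatch.

```perl
use strict;
use warnings;

# naive_expand() is a hypothetical helper for illustration only. It replaces
# named entity references textually and leaves unknown names untouched.
sub naive_expand {
    my ($text, $entities) = @_;
    $text =~ s/&(\w+);/exists $entities->{$1} ? $entities->{$1} : "&$1;"/ge;
    return $text;
}

my $body = '<root>&lt;SOMETHING&gt;</root>';

# Declarations like the ones in the script above: raw markup characters.
print naive_expand($body, { lt => '<', gt => '>' }), "\n";
# prints: <root><SOMETHING></root>

# Replacement text that keeps the characters escaped as character references.
print naive_expand($body, { lt => '&#60;', gt => '&#62;' }), "\n";
# prints: <root>&#60;SOMETHING&#62;</root>
```

In the first case a parser would see a SOMETHING start tag that is never closed before the root end tag; in the second the text stays character data.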