problem using XML::SAX::ParserFactory

by moshkod (Sexton)
on Jul 21, 2005 at 10:50 UTC (#476771=perlquestion)

moshkod has asked for the wisdom of the Perl Monks concerning the following question:

I have a CGI program that receives a valid XML document and then parses it using XML::SAX::ParserFactory. The parsing fails with the following error:
[Wed Jul 20 17:23:23 2005] [error] [Wed Jul 20 17:23:23 2005] -e: \n[Wed Jul 20 17:23:23 2005] -e: 500 Can't connect to (connect: Connection timed out)\n[Wed Jul 20 17:23:23 2005] -e: Handler couldn't resolve external entity at line 2, column 149, byte 171\n[Wed Jul 20 17:23:23 2005] -e: error in processing external entity reference at line 2, column 149, byte 171 at /exlibris/sfx_ver/sfx_version_3/app/perl-5.8.6/lib/site_perl/5.8.6/ +sun4-solaris/XML/ line 187\n
I believe the parsing fails because it cannot reach the DTD (the server is behind a firewall), but it is not clear to me why it attempts to connect to the DTD at:
since XML::SAX::ParserFactory by default does not do any validation, and in my program I did not change the parser's default settings.

Any ideas?
Thanks, Dana

Replies are listed 'Best First'.
Re: problem using XML::SAX::ParserFactory
by arturo (Vicar) on Jul 21, 2005 at 13:17 UTC

    As is pointed out in the XML spec, DTDs aren't just for validation; they also tell you what the default values of attributes are, and may contain entity declarations. In a way, DTDs contain some of the document's information. So, because these things can be needed to 'read' the document, parsers commonly fetch a declared DTD whether or not they are validating. (Strictly speaking, the spec obliges a non-validating parser to process only the internal subset; reading the external subset is optional, but many parsers do it.)

    If you're sure the document's valid, you could strip out the DOCTYPE declaration before siccing your parser on it.
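    A minimal sketch of the stripping step, using only core Perl. The helper name strip_doctype and the example.invalid URL are made up for illustration. The regex assumes the DOCTYPE contains no '[' or '>' inside its quoted identifiers and no ']' inside an internal subset, so treat it as a quick workaround rather than a general XML tool. Note also that if the document relies on entities or attribute defaults declared in the DTD, removing the declaration will change or break the parse.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: remove the first DOCTYPE declaration, with or
    # without an internal subset in [...]. Assumes no '[' or '>' appears
    # inside the quoted public/system identifiers.
    sub strip_doctype {
        my ($xml) = @_;
        $xml =~ s/<!DOCTYPE\s[^\[>]*(?:\[.*?\]\s*)?>//s;
        return $xml;
    }

    # Example input; the system ID is a placeholder, not the poster's URL.
    my $xml = qq{<?xml version="1.0"?>\n}
            . qq{<!DOCTYPE doc SYSTEM "http://example.invalid/doc.dtd">\n}
            . qq{<doc/>};

    my $clean = strip_doctype($xml);
    ```

    The cleaned string can then be handed to the parser (for example through its parse_string method) in place of the original document.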

    Update: an even better solution would be to get a local copy of the DTD and use catalog resolution to find it. A catalog resolver can tell a parser where to look for a DTD with a given public ID, irrespective of the system ID (the URL in the doctype declaration).

    If not P, what? Q maybe?
    "Sidney Morgenbesser"

      Dana, I found something interesting in the documentation.

      The solution is along the lines of what the other poster suggested.

      I think you can redirect the DTD lookup to a local file by using the resolve_entity method. It says that you "can use this method to redirect external system identifiers to secure and/or local URIs, to look up public identifiers in a catalogue, or to read an entity from a database or other input source (including, for example, a dialog box)."

      From CPAN
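      Here is a rough sketch of what such a resolver could look like, combined with the catalogue idea from the earlier reply. The class name LocalDTDResolver, the public ID, and the local path are all invented for illustration. It follows the Perl SAX 2 convention that resolve_entity receives a hash reference with PublicId and SystemId keys; whether your installed SAX driver actually calls this method for the external DTD, and whether it accepts a returned hash carrying only a rewritten SystemId, is an assumption you would need to check against that driver's documentation.

      ```perl
      use strict;
      use warnings;

      {
          package LocalDTDResolver;    # hypothetical class name

          # catalog: hash ref mapping public IDs to local DTD copies.
          sub new {
              my ($class, %args) = @_;
              return bless { catalog => $args{catalog} || {} }, $class;
          }

          # SAX2-style EntityResolver callback. $entity is a hash ref
          # with PublicId and SystemId keys.
          sub resolve_entity {
              my ($self, $entity) = @_;
              my $public = $entity->{PublicId};
              my $local  = defined $public ? $self->{catalog}{$public} : undef;

              # Known public ID: point the parser at the local copy.
              return { PublicId => $public, SystemId => $local }
                  if defined $local;

              # Unknown ID: return nothing and let the parser decide.
              return;
          }
      }

      # Both the public ID and the path below are placeholders.
      my $resolver = LocalDTDResolver->new(
          catalog => {
              '-//EXAMPLE//DTD Doc 1.0//EN' => '/usr/local/share/dtd/doc.dtd',
          },
      );

      # Assumed wiring (not run here, needs XML::SAX installed):
      #   XML::SAX::ParserFactory->parser(
      #       Handler        => $handler,
      #       EntityResolver => $resolver,
      #   );
      ```

      Keying the lookup on the public ID rather than the URL means the document itself does not have to change: the doctype can keep pointing at the remote DTD while the parser reads the local copy.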

Node Type: perlquestion [id://476771]
Approved by Tanalis
Front-paged by planetscape