water has asked for the wisdom of the Perl Monks concerning the following question:

We're using XML::Twig to parse large XML files received from another source. The files are well-formed and valid, but they lack a DOCTYPE declaration. We have the correct DTD in a separate file.

We'd like the benefits of validation. Do I need to insert the DOCTYPE in the XML files to get this? Or is there a way to tell Twig "here's a file, it has no DOCTYPE, but please validate it using the DTD in this file over there"?

The alternative is to rewrite each file with the one extra line tucked in at the top, but the files are big, and we'd prefer not to rewrite and shuffle files.

Thanks for any suggestions.

water

Replies are listed 'Best First'.
Re: XML::Twig -- validating files lacking DOCTYPE
by Aristotle (Chancellor) on Mar 20, 2004 at 18:40 UTC
    I just had a quick glance at the XML::Twig doc and saw it mentions a $twig->set_doctype method. Isn't that what you're looking for?

    Makeshifts last the longest.

      In our experiments, set_doctype seemed to do only what its name suggests -- it names the doctype for the output XML -- and it did not validate the inbound XML against the DTD. If we're wrong on this, could you post a snippet showing how to use set_doctype correctly? Thanks!
Re: XML::Twig -- validating files lacking DOCTYPE
by mirod (Canon) on Mar 22, 2004 at 17:22 UTC

    XML::Twig is based on expat, which is a non-validating parser. So it doesn't validate against a DTD.

    You can do the validation as a separate step, using a validating parser of your choice.
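
As a rough sketch of such a separate validation step, one option is XML::LibXML, which can validate a parsed document against a DTD object that was loaded independently, so the XML itself needs no DOCTYPE. The helper name dtd_errors and the tiny sample DTD and documents below are made up for illustration. Note that this approach parses the whole document into memory, which may be a concern for very large files.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: returns '' if $xml is valid against $dtd_text,
# otherwise the error message raised by the validator.
sub dtd_errors {
    my ($xml, $dtd_text) = @_;

    # Build the DTD object on its own; the XML carries no DOCTYPE.
    my $dtd = XML::LibXML::Dtd->parse_string($dtd_text);

    # For a file on disk, use load_xml(location => $path) instead.
    my $doc = XML::LibXML->load_xml(string => $xml);

    # validate() dies with a message when the document is invalid.
    my $ok = eval { $doc->validate($dtd); 1 };
    return $ok ? '' : ("$@" || 'unknown validation error');
}

# Illustrative DTD and documents (not from the original question).
my $dtd_text = <<'DTD';
<!ELEMENT order (item+)>
<!ELEMENT item (#PCDATA)>
DTD

my $good = '<order><item>tea</item></order>';
my $bad  = '<order><price>3</price></order>';

for my $xml ($good, $bad) {
    my $err = dtd_errors($xml, $dtd_text);
    print $err eq '' ? "valid\n" : "invalid: $err\n";
}
```

Once a file passes this check, it can be handed to XML::Twig for the actual processing as before.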