in reply to xml validation using perl

First of all, it would help if you said which parser you are using...

Secondly, as a general note on your (general) title: validating XML with Perl is a bit tricky. It depends, for instance, on what level of validation you want to perform. Most of the Perl solutions support validation using DTDs. This gives you only (very) limited possibilities for validating your XML documents. DTDs were the first attempt at validating XML documents and IMHO not a very good one. Using DTDs has certain disadvantages, the lack of real datatypes being one of them.
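
To illustrate how limited DTD validation is, here is a minimal sketch. It assumes XML::LibXML purely for illustration (the question does not say which parser is in use), and the person/name/age vocabulary and the dtd_valid helper are made up for the example. The DTD can enforce element order, but it cannot require that the age is a number:

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical DTD: a <person> holds one <name> followed by one <age>.
my $dtd = XML::LibXML::Dtd->parse_string(<<'DTD');
<!ELEMENT person (name, age)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT age (#PCDATA)>
DTD

# Returns 1 if the XML string is valid against the DTD above, 0 otherwise.
sub dtd_valid {
    my ($xml) = @_;
    my $doc = XML::LibXML->load_xml(string => $xml);
    return $doc->is_valid($dtd) ? 1 : 0;
}

# Children in the declared order, numeric age.
my $good = dtd_valid('<person><name>Ann</name><age>42</age></person>');

# Children in the wrong order: the DTD can catch this.
my $bad_order = dtd_valid('<person><age>42</age><name>Ann</name></person>');

# Non-numeric age: #PCDATA accepts any text, so the DTD cannot catch this.
my $bad_type = dtd_valid('<person><name>Ann</name><age>forty-two</age></person>');

print "declared order, numeric age: $good\n";
print "wrong order:                 $bad_order\n";
print "non-numeric age:             $bad_type\n";
```

The third document is structurally fine, so it validates even though the age is nonsense. That is the kind of check a schema language with datatypes is needed for.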

Several schema languages were designed to counter these problems (e.g. Schematron, RELAX NG). The most popular and best supported is W3C XML Schema. It is safe to say it isn't exactly perfect either, but many of the issues with DTDs were more or less solved. XML Schema gives you much more power to validate documents at the cost of added complexity, i.e. XML Schema is not simple and there is a learning curve. Furthermore, I know of no Perl solution that fully supports XML Schema; as far as I recall, only subsets are supported. Depending on what you want to do, this might work for you. I have my own set of tools to work with XML. If I had to pick a Perl solution it would probably be XML::Xerces, but I'm biased towards Xalan/Xerces.

NB: Over time, some very clever/tricky solutions were devised to counter the problems with DTDs, e.g. adding datatypes, but they never worked out that well.

Cheers,

Harry

Re^2: xml validation using perl
by ikegami (Patriarch) on Feb 08, 2010 at 05:52 UTC

    Furthermore I know of no Perl solution that fully supports XML Schema,

    I haven't encountered any limitations with XML::LibXML. See XML::LibXML::Schema.
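
    For readers who have not used it, validation with XML::LibXML::Schema can be sketched roughly as follows. The schema, the person/name/age vocabulary and the schema_error helper are assumptions made up for this example, not something from the thread. validate() dies when the document is invalid, so the call is wrapped in eval:

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical schema: <person> holds a string <name> and a
# non-negative integer <age>, in that order.
my $schema = XML::LibXML::Schema->new(string => <<'XSD');
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="person">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="age" type="xs:nonNegativeInteger"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
XSD

# Returns '' if the XML string validates, otherwise the error text.
sub schema_error {
    my ($xml) = @_;
    my $doc = XML::LibXML->load_xml(string => $xml);
    return '' if eval { $schema->validate($doc); 1 };
    return "$@" || 'validation failed';
}

my $ok_err  = schema_error('<person><name>Ann</name><age>42</age></person>');
my $bad_err = schema_error('<person><name>Ann</name><age>forty-two</age></person>');

print "numeric age:     ", ($ok_err  eq '' ? 'valid' : "invalid: $ok_err"),  "\n";
print "non-numeric age: ", ($bad_err eq '' ? 'valid' : "invalid: $bad_err"), "\n";
```

    A schema kept in a file can be loaded with location => $path instead of string. Whether this covers everything a particular schema needs still depends on the underlying libxml2 version, so it is worth testing against your own schemas.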

      My libxml/XML Schema experience dates from some years ago. At that time it didn't really work. On http://xmlsoft.org I read things like:

      A partial implementation of XML Schemas Part 1: Structure is being worked on but it would be far too early to make any conformance statement about it at the moment.

      It also doesn't claim to completely implement DOM and SAX?! That scares me away. I use Xalan/Xerces because I have had good experiences with them and they fully implement most of the W3C recommendations, see for example the Xerces features. Of course there might be some exotic features I'll probably never use; still, I'd like to make that choice myself.

      Cheers

      Harry