mart0000 has asked for the wisdom of the Perl Monks concerning the following question:

I've used XML::LibXML::Schema to successfully validate a single XML file against a single XSD schema file. Has anyone used the module to validate against multiple XSD schemas, where the first schema depends on a second, the second on a third, and so on? If XML::LibXML::Schema is not the suitable module for this purpose, what's the success rate with the alternatives ?

Re: Validating an XML file with multiple schemas
by haukex (Archbishop) on Jan 04, 2019 at 08:13 UTC
    If XML::LibXML::Schema is not the suitable module for this purpose, what's the success rate with the alternatives ?

    XML::LibXML is an interface to libxml2, which is a very powerful library. I'd be surprised if it couldn't handle it, so I'd try going with it first. Sometimes it takes a bit of tweaking, for example here I showed how to write custom code to cache external resources during DTD validation. If you have trouble with it, feel free to report back here with an SSCCE that reproduces the issue.

      Thank you for your quick response! Please be aware that my Perl skills are in need of severe repair, so my responses will reflect that. From looking at your example, I'm trying to figure out how to adapt the idea of caching a list of schema file contents, similar to caching the DTD (or DTDs) from external resources referenced by the URIs within an HTML, XHTML, XML, ... document. After reading the Schema.pod packaged with XML-LibXML-2.0132, I got the impression that libxml2's support for W3C Schema may not be as mature as its support for DTDs. I hope I'm very much mistaken.

      Let me give you a brief example of the single-schema usage (minus the error handling and debugging):

      package example;

      use strict;
      use warnings;
      use XML::LibXML;

      my $xmlFilePath = '<local file path>';    # path to the XML instance
      my $xsdFilePath = '<local file path>';    # path to the XSD schema

      my $document = XML::LibXML->load_xml( location => $xmlFilePath );
      my $schema   = XML::LibXML::Schema->new( location => $xsdFilePath );

      # validate() dies with a validation error if the document is invalid
      $schema->validate( $document );

      Simple & short. To extend the above for multiple schemas, and keep the same feel, I'll want to build something that allows the following usage:

      :
      :
      my $schema = XML::LibXML::Schema->new( location => $xsdFile1Path );
      $schema->add( location => $xsdFile2Path );
      $schema->add( location => $xsdFile3Path );
      :
      :

      Or something similar and grammatically accurate. As long as the library internally has the mechanism to support the dependencies between the schemas themselves, it shouldn't be too complicated to extend XML::LibXML::Schema and take advantage. However, if it were so, I'd imagine the author would have already made an attempt.

      Do you still think I'll be successful in reusing your idea to achieve the above ?

        It's unclear to me whether by "multiple schemas" you mean validating one XML file against multiple different schemas, or whether it's one Schema file that includes other Schema files. Could you show a short, complete example, with simple XSD files that represent what you're trying to do? Please see Short, Self-Contained, Correct Example.

        The following works for me.

Re: Validating an XML file with multiple schemas
by Veltro (Hermit) on Jan 10, 2019 at 18:26 UTC

    Hello mart0000,

    As others in this thread have already indicated, libxml2 seems to have some limitations here. A web search confirms this as well (such as this Google result).

    Things that are valid and should be possible aren't, such as adding extra references to the xsi:schemaLocation attribute:

    xsi:schemaLocation="urn:tempuri:Personal personal.xsd urn:tempuri:Contact email.xsd urn:tempuri:Contact address.xsd"

    or adding lines to personal.xsd:

    <import schemaLocation="address.xsd" namespace="urn:tempuri:Contact"/>
    <import schemaLocation="email.xsd" namespace="urn:tempuri:Contact"/>

    Multiple imports with the same namespace don't work.

    But after some experimenting, I believe I have found a workaround. It will only work if you are able to modify personal.xsd, so I hope that is the case:

    Step 1: Create a new xsd file that 'includes' address.xsd and email.xsd with the filename 'contact.xsd' as follows:

    <?xml version="1.0" encoding="utf-8"?>
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
               xmlns:Contact="urn:tempuri:Contact"
               targetNamespace="urn:tempuri:Contact"
               elementFormDefault="unqualified">
      <xs:include schemaLocation="address.xsd"/>
      <xs:include schemaLocation="email.xsd"/>
    </xs:schema>

    Step 2: Change personal.xsd by adding the import line. I also added xmlns:con=..., but I'm not sure whether that was needed:

    <?xml version="1.0" encoding="UTF-8"?>
    <schema xmlns="http://www.w3.org/2001/XMLSchema"
            xmlns:per="urn:tempuri:Personal"
            xmlns:con="urn:tempuri:Contact"
            targetNamespace="urn:tempuri:Personal"
            elementFormDefault="unqualified">
      <import schemaLocation="contact.xsd" namespace="urn:tempuri:Contact"/>
      <element name="PersonalInfo">
        <complexType>
          <sequence>
            <element name="FirstName" type="string"/>
            <element name="LastName" type="string"/>
            <element name="Contact" type="per:ContactType"/>
          </sequence>
        </complexType>
      </element>
      <complexType name="ContactType">
        <sequence>
          <any namespace="##other" processContents="strict" maxOccurs="unbounded"/>
        </sequence>
      </complexType>
    </schema>

    Step 3: Change the xml files so that they will resemble the following:

    <?xml version="1.0" encoding="UTF-8"?>
    <pinfo:PersonalInfo xmlns:pinfo="urn:tempuri:Personal"
                        xmlns:cinfo="urn:tempuri:Contact"
                        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                        xsi:schemaLocation="urn:tempuri:Personal personal.xsd">
      <FirstName>First Name</FirstName>
      <LastName>Last Name</LastName>
      <Contact>
        <cinfo:Address>
          <Street>Main Street</Street>
          <City>Main City</City>
        </cinfo:Address>
      </Contact>
    </pinfo:PersonalInfo>

    Even though I don't think that anything is done with xsi:schemaLocation, I left it there for completeness.
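The three steps above can be exercised end to end with a short script. This is only a rough sketch under stated assumptions: the thread never shows address.xsd or email.xsd, so the versions below are minimal stand-ins invented for illustration, and the file name personal.xml is likewise hypothetical. For brevity, the xmlns:con declaration and the xsi:schemaLocation attribute are left out. The script writes all files to a scratch directory, loads only personal.xsd, and relies on libxml2 following the import and include chain from there.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;
use XML::LibXML;

# All files live in one scratch directory so that the relative
# schemaLocation values resolve against each other.
my $dir = tempdir( CLEANUP => 1 );

my %files = (
    # ASSUMPTION: minimal stand-in, the thread does not show address.xsd.
    'address.xsd' => <<'XSD',
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="urn:tempuri:Contact" elementFormDefault="unqualified">
  <xs:element name="Address">
    <xs:complexType><xs:sequence>
      <xs:element name="Street" type="xs:string"/>
      <xs:element name="City" type="xs:string"/>
    </xs:sequence></xs:complexType>
  </xs:element>
</xs:schema>
XSD
    # ASSUMPTION: minimal stand-in, the thread does not show email.xsd.
    'email.xsd' => <<'XSD',
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="urn:tempuri:Contact" elementFormDefault="unqualified">
  <xs:element name="Email" type="xs:string"/>
</xs:schema>
XSD
    # Step 1: one wrapper schema that includes both Contact schemas.
    'contact.xsd' => <<'XSD',
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="urn:tempuri:Contact" elementFormDefault="unqualified">
  <xs:include schemaLocation="address.xsd"/>
  <xs:include schemaLocation="email.xsd"/>
</xs:schema>
XSD
    # Step 2: personal.xsd imports the Contact namespace exactly once.
    'personal.xsd' => <<'XSD',
<schema xmlns="http://www.w3.org/2001/XMLSchema"
    xmlns:per="urn:tempuri:Personal"
    targetNamespace="urn:tempuri:Personal" elementFormDefault="unqualified">
  <import schemaLocation="contact.xsd" namespace="urn:tempuri:Contact"/>
  <element name="PersonalInfo">
    <complexType><sequence>
      <element name="FirstName" type="string"/>
      <element name="LastName" type="string"/>
      <element name="Contact" type="per:ContactType"/>
    </sequence></complexType>
  </element>
  <complexType name="ContactType">
    <sequence>
      <any namespace="##other" processContents="strict" maxOccurs="unbounded"/>
    </sequence>
  </complexType>
</schema>
XSD
    # Step 3: the instance document (hypothetical file name).
    'personal.xml' => <<'XML',
<pinfo:PersonalInfo xmlns:pinfo="urn:tempuri:Personal"
    xmlns:cinfo="urn:tempuri:Contact">
  <FirstName>First Name</FirstName>
  <LastName>Last Name</LastName>
  <Contact>
    <cinfo:Address>
      <Street>Main Street</Street>
      <City>Main City</City>
    </cinfo:Address>
  </Contact>
</pinfo:PersonalInfo>
XML
);

# Write every schema and the instance document to disk.
for my $name ( sort keys %files ) {
    my $path = File::Spec->catfile( $dir, $name );
    open my $fh, '>', $path or die "Cannot write $path: $!";
    print {$fh} $files{$name};
    close $fh or die "Cannot close $path: $!";
}

# Only the top-level schema is loaded explicitly.
my $schema = XML::LibXML::Schema->new(
    location => File::Spec->catfile( $dir, 'personal.xsd' ) );
my $doc = XML::LibXML->load_xml(
    location => File::Spec->catfile( $dir, 'personal.xml' ) );

# validate() dies on a validation error, so wrap it in eval.
my $valid = eval { $schema->validate($doc); 1 } ? 1 : 0;
print $valid ? "valid\n" : "invalid: $@";
```

Wrapping validate() in eval keeps a validation failure from killing the script and leaves the libxml2 error message in $@, which is usually the quickest way to see which import or element declaration was not picked up.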

    Hope this helps,

    Veltro