It's unclear to me whether by "multiple schemas" you mean validating one XML file against multiple different schemas, or whether it's one Schema file that includes other Schema files. Could you show a short, complete example, with simple XSD files that represent what you're trying to do? Please see Short, Self-Contained, Correct Example.

The following works for me.

schema.xsd:

<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://www.example.com"
    xmlns:foo="http://www.example.com"
    elementFormDefault="qualified">
  <include schemaLocation="included.xsd" />
  <element name="hello">
    <complexType>
      <sequence>
        <element name="world" type="foo:worldType" />
      </sequence>
    </complexType>
  </element>
</schema>

included.xsd:

<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://www.example.com"
    xmlns:xhtml="http://www.w3.org/1999/xhtml"
    xmlns:foo="http://www.example.com"
    elementFormDefault="qualified">
  <import namespace="http://www.w3.org/1999/xhtml"
      schemaLocation="http://www.w3.org/2002/08/xhtml/xhtml1-transitional.xsd" />
  <complexType name="worldType">
    <complexContent>
      <extension base="xhtml:Flow">
        <attribute name="foo" type="string" use="required" />
      </extension>
    </complexContent>
  </complexType>
</schema>

Code (note that it was necessary to use XML::LibXML::externalEntityLoader() instead of $parser->input_callbacks(), because I didn't see another way for the callbacks to affect XML::LibXML::Schema):

use warnings;
use strict;
use utf8;
use XML::LibXML;
use URI;
use HTTP::Tiny;

my $http = HTTP::Tiny->new;
my %cache;
XML::LibXML::externalEntityLoader(sub {
    my ($url, $id) = @_;
    die "Can't handle ID '$id'" if length $id;
    my $uri = URI->new($url);
    my $file;
    if (!$uri->scheme) { $file = $url }
    elsif ($uri->scheme eq 'file') { $file = $uri->path }
    if (defined $file) {
        warn "'$uri' => Loading '$file' from disk\n"; #Debug
        open my $fh, '<', $file or die "$file: $!";
        my $data = do { local $/; <$fh> };
        close $fh;
        return $data;
    } # else
    die "Can't handle URL scheme: ".$uri->scheme
        unless $uri->scheme=~/\Ahttps?\z/i;
    if (!defined $cache{$uri}) {
        warn "'$uri' => Fetching...\n"; #Debug
        my $resp = $http->get($uri);
        die "$uri: $resp->{status} $resp->{reason}\n"
            unless $resp->{success};
        $cache{$uri} = $resp->{content};
    }
    else { warn "'$uri' => Cached\n"; } #Debug
    return $cache{$uri};
});

print "Loading schema...\n";
my $xsd = XML::LibXML::Schema->new( location => 'schema.xsd' );

my @xmls = (<<'END_XML_ONE',<<'END_XML_TWO',<<'END_XML_THREE');
<?xml version="1.0" encoding="UTF-8"?>
<hello xmlns="http://www.example.com">
  <world foo="bar">
    <p xmlns="http://www.w3.org/1999/xhtml">
      <i>x</i>
    </p>
  </world>
</hello>
END_XML_ONE
<?xml version="1.0" encoding="UTF-8"?>
<hello xmlns="http://www.example.com">
  <world>
    <p xmlns="http://www.w3.org/1999/xhtml">
      <i>x</i>
    </p>
  </world>
</hello>
END_XML_TWO
<?xml version="1.0" encoding="UTF-8"?>
<hello xmlns="http://www.example.com">
  <world foo="bar">
    <p xmlns="http://www.w3.org/1999/xhtml">
      <foo>x</foo>
    </p>
  </world>
</hello>
END_XML_THREE

my $i = 1;
for my $xml (@xmls) {
    print "Validating XML #$i...\n";
    my $doc = XML::LibXML->load_xml( string => $xml );
    if ( eval { $xsd->validate($doc); 1 } ) {
        print "=> Valid!\n"
    }
    else { print "=> Invalid! $@" }
}
continue { $i++ }

Output:

Loading schema...
'schema.xsd' => Loading 'schema.xsd' from disk
'included.xsd' => Loading 'included.xsd' from disk
'http://www.w3.org/2002/08/xhtml/xhtml1-transitional.xsd' => Fetching...
'http://www.w3.org/2001/xml.xsd' => Fetching...
Validating XML #1...
=> Valid!
Validating XML #2...
=> Invalid! unknown-137e570:0: Schemas validity error : Element '{http://www.example.com}world': The attribute 'foo' is required but missing.
Validating XML #3...
=> Invalid! unknown-137e570:0: Schemas validity error : Element '{http://www.w3.org/1999/xhtml}foo': This element is not expected. Expected is one of ( {http://www.w3.org/1999/xhtml}a, {http://www.w3.org/1999/xhtml}br, {http://www.w3.org/1999/xhtml}span, {http://www.w3.org/1999/xhtml}bdo, {http://www.w3.org/1999/xhtml}object, {http://www.w3.org/1999/xhtml}applet, {http://www.w3.org/1999/xhtml}img, {http://www.w3.org/1999/xhtml}map, {http://www.w3.org/1999/xhtml}iframe, {http://www.w3.org/1999/xhtml}tt ).

And just for the sake of completeness, here's the original code I posted on StackOverflow that uses an XML::LibXML::InputCallback:

use warnings;
use strict;
use XML::LibXML;
use HTTP::Tiny;
use URI;

my $parser = XML::LibXML->new;
my $cb = XML::LibXML::InputCallback->new;
my $http = HTTP::Tiny->new;
my %cache;
$cb->register_callbacks([
    sub { 1 }, # match (URI), returns Bool
    sub { # open (URI), returns Handle
        my $uri = URI->new($_[0]);
        my $file;
        #warn "Handling <<$uri>>\n"; #Debug
        if (!$uri->scheme) { $file = $_[0] }
        elsif ($uri->scheme eq 'file') { $file = $uri->path }
        elsif ($uri->scheme=~/\Ahttps?\z/i) {
            if (!defined $cache{$uri}) {
                my $resp = $http->get($uri);
                die "$uri: $resp->{status} $resp->{reason}\n"
                    unless $resp->{success};
                $cache{$uri} = $resp->{content};
            }
            $file = \$cache{$uri};
        }
        else { die "unsupported URL scheme: ".$uri->scheme }
        open my $fh, '<', $file or die "$file: $!";
        return $fh;
    },
    sub { # read (Handle,Length), returns Data
        my ($fh,$len) = @_;
        read($fh, my $buf, $len);
        return $buf;
    },
    sub { close shift } # close (Handle)
]);
$parser->input_callbacks($cb);

my $doc = $parser->load_xml( IO => \*DATA );
print "Is valid: ", $doc->is_valid ? "yes" : "no", "\n";

__DATA__
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE LinkSet PUBLIC "-//NLM//DTD LinkOut 1.0//EN"
  "https://www.ncbi.nlm.nih.gov/projects/linkout/doc/LinkOut.dtd" [
  <!ENTITY base.url "https://some.domain.com">
  <!ENTITY icon.url "https://some.domain.com/logo.png">
]>
<LinkSet>
  <Link>
    <LinkId>1</LinkId>
    <ProviderId>XXXX</ProviderId>
    <IconUrl>&icon.url;</IconUrl>
    <ObjectSelector>
      <Database>PubMed</Database>
      <ObjectList>
        <ObjId>1234567890</ObjId>
      </ObjectList>
    </ObjectSelector>
    <ObjectUrl>
      <Base>&base.url;</Base>
      <Rule>/1/</Rule>
    </ObjectUrl>
  </Link>
</LinkSet>

And finally, here's a variation of the caching code that uses an on-disk cache (Update: It's not perfect, because there's a tiny chance of filename collisions if clean_fragment happens to map two URLs to the same filename, but this is meant as more of a proof of concept; there are plenty of other caching mechanisms available. As just one example, note how I used Memoize::Storable to cache the return values of the get_deps function here.):

my $CACHE_DIR = '/tmp/xmlcache';
use File::Path qw/make_path/;
make_path($CACHE_DIR, {verbose=>1});
use URI;
use HTTP::Tiny;
use Text::CleanFragment qw/clean_fragment/;
use File::Spec::Functions qw/catfile/;

my $http = HTTP::Tiny->new;
XML::LibXML::externalEntityLoader(sub {
    my ($url, $id) = @_;
    die "Can't handle ID '$id'" if length $id;
    my $uri = URI->new($url);
    my $file;
    if (!$uri->scheme) { $file = $url }
    elsif ($uri->scheme eq 'file') { $file = $uri->path }
    elsif ($uri->scheme=~/\Ahttps?\z/i) {
        # Note there is a (tiny) chance of filename collisions here!
        $file = catfile($CACHE_DIR, clean_fragment("$uri"));
        if (!-e $file) {
            warn "'$uri' => Mirroring to '$file'...\n"; #Debug
            my $resp = $http->mirror($uri, "$file");
            die "$uri: $resp->{status} $resp->{reason}\n"
                unless $resp->{success};
        }
    }
    else { die "Can't handle URL scheme: ".$uri->scheme }
    warn "'$uri' => Loading '$file' from disk\n"; #Debug
    open my $fh, '<', $file or die "$file: $!";
    my $data = do { local $/; <$fh> };
    close $fh;
    return $data;
});
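Regarding the filename collision caveat mentioned above, one possible way around it is to derive the cache filename from a digest of the URL instead of a cleaned-up version of it. The following is only a minimal sketch under that assumption: cache_file_for is a hypothetical helper that is not part of the code above, and the ".xml" suffix is an arbitrary choice. It uses only the core modules Digest::SHA and File::Spec::Functions.

```perl
use warnings;
use strict;
use Digest::SHA qw/sha256_hex/;
use File::Spec::Functions qw/catfile/;

# Hypothetical helper (an assumption, not from the code above):
# map a URL to a cache filename based on the SHA-256 hex digest
# of the URL string. Two different URLs producing the same name
# would require a SHA-256 collision.
sub cache_file_for {
    my ($cache_dir, $url) = @_;
    # Stringify in case a URI object is passed in; URI objects
    # stringify to escaped ASCII, which sha256_hex accepts.
    return catfile($cache_dir, sha256_hex("$url") . '.xml');
}

print cache_file_for('/tmp/xmlcache',
    'http://www.w3.org/2001/xml.xsd'), "\n";
```

In the loader above, this would stand in for the catfile($CACHE_DIR, clean_fragment("$uri")) expression. The trade-off is that the cached filenames are no longer human-readable, so the debug messages become more important for seeing which URL ended up in which file.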

In reply to Re^3: Validating an XML file with multiple schemas by haukex
in thread Validating an XML file with multiple schemas by mart0000
