worik has asked for the wisdom of the Perl Monks concerning the following question:

I need to compare XML that may come from different sources, where namespace prefixes will be defined separately and perhaps differently.

    #!/usr/bin/perl -w
    use XML::LibXML;
    my $parser = XML::LibXML->new();

    my $xmls1 = '<?xml version="1.0" encoding="utf-8" ?> '.
        '<LIST '.
        'xmlns:Z1="TheNameSpace"> '.
        '<Z1:Authors/>'.
        '</LIST>';
    my $xmls2 = '<?xml version="1.0" encoding="utf-8" ?> '.
        '<LIST '.
        'xmlns:Z2="TheNameSpace"> '.
        '<Z2:Authors/>'.
        '</LIST>';

    my $xml1 = $parser->parse_string($xmls1)->documentElement();
    my $xml2 = $parser->parse_string($xmls2)->documentElement();

    my @c1 = $xml1->nonBlankChildNodes();
    my $c1 = $c1[0];
    my @c2 = $xml2->nonBlankChildNodes();
    my $c2 = $c2[0];

    my $c1ref = ref($c1);
    my $c2ref = ref($c2);
    print "\$c1ref $c1ref \$c2ref $c2ref\n";
    print $c1->toString()."\n";
    print $c2->toString()."\n";

    my $ns1 = $c1->namespaceURI();
    my $ns2 = $c2->namespaceURI();
    my $nn1 = $c1->nodeName();
    my $nn2 = $c2->nodeName();
    print "\$ns1 $ns1\n\$ns2 $ns2\n\n";
    print "\$nn1 $nn1\n\$nn2 $nn2\n\n";

    print $c1->isEqual($c2)?"Equal\n":"Not\n";

$c1 and $c2 should be the same. I could compare the namespaces from namespaceURI() and strip the prefix off nodeName(). That would work, but there must be a better way. Is there?

BTW the output I get from that code is:

    $c1ref XML::LibXML::Element $c2ref XML::LibXML::Element
    <Z1:Authors/>
    <Z2:Authors/>
    $ns1 TheNameSpace
    $ns2 TheNameSpace

    $nn1 Z1:Authors
    $nn2 Z2:Authors

    Not

Replies are listed 'Best First'.
Re: Comparing nodes in LibXML
by tangent (Parson) on Sep 03, 2015 at 02:00 UTC
    If you look in the docs for XML::LibXML::Node you will see:
        isEqual: deprecated version of isSameNode().
        isSameNode: returns TRUE (1) if the given nodes refer to the same node structure, otherwise FALSE (0) is returned.
    In other words, it checks to see if it is the very same node, so even if you change the prefix of the second element (in $xmls2) to Z1, isEqual() will still return false. I assume isEqual() has been deprecated because of this.
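
To make the identity check described above concrete, here is a minimal sketch. The sample document and the variable names are made up for illustration; isSameNode(), firstChild() and nonBlankChildNodes() are the methods from the XML::LibXML::Node docs. Two separate Perl handles to one underlying node compare as the same, while two sibling elements with identical names do not.

```perl
use strict;
use warnings;
use XML::LibXML;

# Illustrative document: two siblings with the same prefix and name
my $doc = XML::LibXML->load_xml(
    string => '<LIST xmlns:Z1="TheNameSpace"><Z1:Authors/><Z1:Authors/></LIST>'
);
my $root = $doc->documentElement();

my ($first, $second) = $root->nonBlankChildNodes();
my $again = $root->firstChild();    # a second handle to the first child

# Same underlying node, reached through two different Perl objects
print $first->isSameNode($again)  ? "same node\n" : "different nodes\n";

# Identical markup, but two distinct nodes in the tree
print $first->isSameNode($second) ? "same node\n" : "different nodes\n";
```

So isSameNode() answers "is this the very same node?", not "do these nodes look alike?", which is why it cannot be used to compare elements from two different documents.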

    You could have a look at localname() in the XML::LibXML::Node docs, and setNamespaceDeclURI() and setNamespaceDeclPrefix() in the XML::LibXML::Element docs.
Re: Comparing nodes in LibXML
by worik (Sexton) on Sep 03, 2015 at 01:59 UTC

    After lunch I hacked this up, which solves my problem. But there must be a better way...

    sub _cmpXML( $$ ){
        # Pass in two XML nodes and return 0 if they are the same, else
        # lexicographically compare the namespaces first, then the node names
        # (without prefixes)
        my $n1 = shift or die;
        my $n2 = shift or die;
        my $ns1 = $n1->namespaceURI();
        my $ns2 = $n2->namespaceURI();
        $ns1 ne $ns2 and return $ns1 cmp $ns2;
        my $nn1 = $n1->nodeName();
        my $nn2 = $n2->nodeName();
        # Strip only an actual "prefix:"; making the colon optional
        # would erase a name that has no prefix at all
        $nn1 =~ s/^[^:]*://;
        $nn2 =~ s/^[^:]*://;
        return $nn1 cmp $nn2;
    }

    There must be a built-in for doing this. It is a bit inconceivable to me that this is not a common requirement.

          my $nn1 = $c1->localname();
          my $nn2 = $c2->localname();
          print "\$nn1 $nn1\n\$nn2 $nn2\n\n";

      Output:

          $nn1 Authors
          $nn2 Authors

        localname is helpful. I missed it in the documentation. But it is only half the picture, as it strips the namespace information. Even though it seems that namespaces are an abomination in XML, I need to account for them. I am not generating the XML I have to eat.
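
One way to get both halves of the picture is to pair localname() with namespaceURI(), so the prefix never enters the comparison. The following is only a sketch under stated assumptions: cmp_qname is a hypothetical helper name, not part of XML::LibXML, and the // operator needs Perl 5.10 or later. It compares element names only, not attributes or child content.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: order two nodes by namespace URI, then by local name.
# Returns 0 when both match, regardless of which prefix each document used.
sub cmp_qname {
    my ($n1, $n2) = @_;
    my $ns1 = $n1->namespaceURI() // '';   # undef when the node has no namespace
    my $ns2 = $n2->namespaceURI() // '';
    return ($ns1 cmp $ns2) || ($n1->localname() cmp $n2->localname());
}

my $parser = XML::LibXML->new();
my $xml1 = $parser->parse_string(
    '<LIST xmlns:Z1="TheNameSpace"><Z1:Authors/></LIST>')->documentElement();
my $xml2 = $parser->parse_string(
    '<LIST xmlns:Z2="TheNameSpace"><Z2:Authors/></LIST>')->documentElement();

my ($c1) = $xml1->nonBlankChildNodes();
my ($c2) = $xml2->nonBlankChildNodes();

print cmp_qname($c1, $c2) == 0 ? "Equal\n" : "Not\n";
```

Because the helper returns a cmp-style value, it can also serve as a sort comparator, for example sort { cmp_qname($a, $b) } @nodes.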