dallen16 has asked for the wisdom of the Perl Monks concerning the following question:

I'm writing a script (package method) to read XML documents per the GraphML schema. I'm using XML::LibXML for the first time and need help understanding XPath queries with documents that use namespaces. See GraphML Primer

The GraphML schema declares a default namespace (an xmlns attribute without a prefix) as follows.

<graphml xmlns="http://graphml.graphdrawing.org/xmlns" [more gorp] >

Recursively traversing the document tree using the "nonBlankChildNodes" method works fine. But from any node/element in the traversal, calling the findnodes method on that node with an XPath query doesn't work (it never returns any nodes). I assume this is a namespace issue?
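A minimal sketch that reproduces the symptom, assuming a small hand-written GraphML fragment (the sample document and variable names are made up for illustration). In XPath 1.0 an unprefixed name means "no namespace", so it cannot match elements that live in the document's default namespace:

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical sample document; only the default namespace matters here.
my $xml = <<'XML';
<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph id="G" edgedefault="undirected">
    <node id="n0"/>
    <node id="n1"/>
    <edge source="n0" target="n1"/>
  </graph>
</graphml>
XML

my $dom = XML::LibXML->load_xml(string => $xml);

# Unprefixed "graph" asks for an element in no namespace, so nothing matches.
my @plain = $dom->findnodes('//graph');

# The root element really is in the GraphML namespace.
my $root = $dom->documentElement;

printf "plain matches: %d\n", scalar @plain;
printf "root namespace: %s\n", $root->namespaceURI;
```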

Alternatively, creating an XPathContext object and then registering the namespace (defining a prefix) does make findnodes work, but only when it is called on that $xpc object, as follows.

my $xpc = XML::LibXML::XPathContext->new($dom);
$xpc->registerNs('gml', 'http://graphml.graphdrawing.org/xmlns');
if (my $graphNode = @{$xpc->findnodes('/gml:graphml/gml:graph')}[0]) {
    if ($graphNode->hasAttributes()) {
        my $directed = $graphNode->getAttribute('edgedefault') || '';
        print "edgedefault = '$directed'\n";
    }
}
else {
    print "did not find graph node edgedefault value\n";
}

Once I get to a particular node (e.g., an "edge" node that may contain a child "data" element), calling that node's findnodes method with an XPath query to find a specific child element doesn't work (it never returns any matching nodes/elements). If I understand correctly, from a given node it should be possible to find descendant nodes using an XPath query of the form ".//data" (where data is a node name).

The alternative that does work is to call the $xpc object's findnodes method, passing the current (e.g., edge) node as the context node, as follows.

my $weightXPATH = ".//gml:data[\@key=\"$weightKey\"]";
if (my $dataWeightNode = @{$xpc->findnodes($weightXPATH, $edgeElement)}[0]) {
    $weight = $dataWeightNode->textContent();
}

In essence, all calls to findnodes must be made on the original $xpc object, optionally using a retrieved node as the context node. Am I missing something fundamental or doing something wrong? Or is this the correct use of XML::LibXML for documents with XML namespaces?

Many thanks for shared wisdom and guidance.

Replies are listed 'Best First'.
Re: XML::LibXML with namespace with no prefix
by choroba (Cardinal) on Dec 03, 2015 at 16:29 UTC
    Yes, using XML::LibXML::XPathContext and context nodes is the usual way to handle namespaces in XML::LibXML.
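As a rough end-to-end sketch of that pattern: one XPathContext with the prefix registered once, and each retrieved node passed back in as the context node for relative queries. The sample document, the key id "d1", and the %weight hash are assumptions made up for illustration; the prefix "gml" is the one used in the question.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical GraphML sample with a weight on each edge.
my $xml = <<'XML';
<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <key id="d1" for="edge" attr.name="weight" attr.type="double"/>
  <graph id="G" edgedefault="undirected">
    <node id="n0"/>
    <node id="n1"/>
    <node id="n2"/>
    <edge source="n0" target="n1"><data key="d1">1.5</data></edge>
    <edge source="n1" target="n2"><data key="d1">2.0</data></edge>
  </graph>
</graphml>
XML

my $dom = XML::LibXML->load_xml(string => $xml);
my $xpc = XML::LibXML::XPathContext->new($dom);
$xpc->registerNs('gml', 'http://graphml.graphdrawing.org/xmlns');

my $weightKey = 'd1';
my %weight;
for my $edge ($xpc->findnodes('//gml:edge')) {
    # The second argument is the context node for the relative path.
    my ($data) = $xpc->findnodes(qq{./gml:data[\@key="$weightKey"]}, $edge);
    next unless $data;
    my $pair = join '->', $edge->getAttribute('source'), $edge->getAttribute('target');
    $weight{$pair} = $data->textContent;
}

print "$_ = $weight{$_}\n" for sort keys %weight;
```

The $xpc object holds the prefix registrations, which is why the queries go through it rather than through the node's own findnodes method.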

    BTW, you can retrieve the first element with

    if (my $dataWeightNode = $xpc->findnodes($weightXPATH, $edgeElement)->[0]) {
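For comparison, a small sketch of three ways to get at the first match, assuming a made-up one-edge document. In scalar context findnodes returns an XML::LibXML::NodeList (a blessed array reference, hence the ->[0]); in list context it returns a plain list; and findvalue skips the node and returns its string value.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical sample document with a single weighted edge.
my $dom = XML::LibXML->load_xml(string => <<'XML');
<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph id="G" edgedefault="directed">
    <edge source="n0" target="n1"><data key="d1">1.5</data></edge>
  </graph>
</graphml>
XML

my $xpc = XML::LibXML::XPathContext->new($dom);
$xpc->registerNs('gml', 'http://graphml.graphdrawing.org/xmlns');
my $xpath = '//gml:edge/gml:data[@key="d1"]';

# Scalar context: index into the returned NodeList.
my $first_by_index = $xpc->findnodes($xpath)->[0];

# List context: keep only the first element of the returned list.
my ($first_by_list) = $xpc->findnodes($xpath);

# Only the text is needed: findvalue returns the string value of the result
# (with several matches, their string values are concatenated).
my $text = $xpc->findvalue($xpath);

print $first_by_index->textContent, "\n";
print $first_by_list->textContent, "\n";
print "$text\n";
```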

      Thank you. And good catch on my sloppy / ugly dereferencing of the array reference. Updated my code.

Re: XML::LibXML with namespace with no prefix
by Anonymous Monk on Dec 04, 2015 at 00:32 UTC

    You can write xpaths like this

    my $firstWeightXpath = qq{ (//*:data[ attribute::key="$weightKey" ])[1] };

    I use this helper, which lets you run namespaced XPath queries on both the DOM and individual nodes, like

    $dom->F(
        '//what:ever',
        'what' => 'urn:yeahwhatever',
        'that' => 'urn:thatalso',
    )->[0]->F('//what:else/that:there');

    sub XML::LibXML::Node::F {
        my $self   = shift;
        my $xpath  = shift;
        my %prefix = @_;
        our $XPATHCONTEXT;
        $XPATHCONTEXT ||= XML::LibXML::XPathContext->new();
        while ( my ( $p, $u ) = each %prefix ) {
            $XPATHCONTEXT->registerNs( $p, $u );
        }
        $XPATHCONTEXT->findnodes( $xpath, $self );
    }
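A sketch of how the helper might be applied to the GraphML case from the question. The sample document and the key id "d1" are assumptions, and the helper body is a lightly condensed copy of the one above, not a different technique. Note that the second call uses a relative path ("./"), since a path starting with "//" searches the whole document regardless of the context node.

```perl
use strict;
use warnings;
use XML::LibXML;

# Condensed copy of the helper: one shared XPathContext, prefixes registered
# on each call, and the invocant used as the context node.
sub XML::LibXML::Node::F {
    my ($self, $xpath, %prefix) = @_;
    our $XPATHCONTEXT;
    $XPATHCONTEXT ||= XML::LibXML::XPathContext->new();
    $XPATHCONTEXT->registerNs($_, $prefix{$_}) for keys %prefix;
    return $XPATHCONTEXT->findnodes($xpath, $self);
}

# Hypothetical sample document.
my $dom = XML::LibXML->load_xml(string => <<'XML');
<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph id="G" edgedefault="directed">
    <edge source="n0" target="n1"><data key="d1">1.5</data></edge>
  </graph>
</graphml>
XML

my $gml  = 'http://graphml.graphdrawing.org/xmlns';
my $edge = $dom->F('//gml:edge', gml => $gml)->[0];

# The prefix stays registered on the shared context, so it is not passed again.
my $data = $edge->F('./gml:data[@key="d1"]')->[0];

print $data->textContent, "\n";
```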

    Example of usage at Re^2: XML namespace question

    I also use xpather.pl

      Thank you. If I understand, your suggested XPath uses a wildcard for the namespace prefix, the "*" in "//*:data". Will give this a try. I think I follow the helper logic and will give it a try as well.

        "If I understand, your suggested XPath uses a wildcard for the namespace prefix"

        Correct, that's what I said, but I got confused and made a mistake: the "*:data" form is XPath 2.0 syntax, and libxml2 only implements XPath 1.0.

        This will work for sure

        (//*[ local-name() = 'data' and attribute::key="$weightKey" ])[1]
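A minimal sketch of the local-name() approach, assuming a made-up one-edge document. Because the node test is "*" and the name is checked inside the predicate, no prefix is needed, so the query can be run with the node's own findnodes method and no XPathContext. The extra namespace-uri() test is optional and is added here only to keep same-named elements from other namespaces out of the result.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical sample document with two data children on one edge.
my $dom = XML::LibXML->load_xml(string => <<'XML');
<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph id="G" edgedefault="directed">
    <edge source="n0" target="n1"><data key="d0">red</data><data key="d1">1.5</data></edge>
  </graph>
</graphml>
XML

my $weightKey = 'd1';
my $ns        = 'http://graphml.graphdrawing.org/xmlns';

# Match any element and test its local name instead of using a prefix.
my ($edge) = $dom->findnodes(q{//*[local-name()='edge']});

# Relative query from the edge node itself.
my ($data) = $edge->findnodes(
    qq{./*[local-name()='data' and namespace-uri()='$ns' and \@key='$weightKey']}
);

print $data->textContent, "\n";
```

The trade-off is readability: the predicates get long quickly, which is one reason registering a prefix on an XPathContext is usually preferred.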