norricorp has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I am trying to understand XML::LibXML. I run the following code:
foreach $node ($dom->findnodes('//*'))
{
    if ($node->nodeType() == 3)
    {
        print "found a text node\n";
    }
    print "node name is ", $node->nodeName(), "\n";
    if ($node->hasChildNodes() == 0) # FALSE
    {
        print "No children: node text is ", $node->textContent(), "\n";
    }
}
It does not work as expected: it never reports a text node. For <title>book title</title> it prints the node name title, and since that element contains the text "book title" it has children, but the text node itself is never picked up. What am I doing wrong?
Regards, John

Replies are listed 'Best First'.
Re: XML::LibXML problem
by choroba (Cardinal) on Dec 29, 2011 at 11:03 UTC
    Instead of 3, use the exported constant:
    if ($node->nodeType == XML_TEXT_NODE) {
    "Text node" is not an element containing text. It is an abstract node in the XML DOM structure that corresponds to the text itself. Moreover, //* only matches elements, not text nodes. To match text nodes, use //text().
      Hi, thanks for the reply. The problem is that I want both elements and text nodes. Having said that, the bigger problem I am having is just getting a single child level of elements. This particular schema is literally dozens of levels deep, but I only want the children directly under a particular element. So far, this query produces output, but it returns all of the elements beneath the starting element, whereas I just want a single layer:
      $query = $root->nodePath() . $extendPath . "[\@name]";
      but this query
      $query = "child::" . $root->nodePath() . $extendPath . "[\@name]";
      produces this error (please excuse the path; it is the schema, not me!):
      XPath error : Invalid expression
      child::/xs:schema/xs:element/xs:complexType/xs:sequence/xs:element/xs:complexType/xs:sequence/xs:element@name
      ^
      at dump2.pl line 85
      Any ideas?
      Regards, John
        I do not quite understand your exact problem. To get the child elements of a particular parent element, just do
        $parent->findnodes("*")
        To get both elements and text nodes, do
        $parent->findnodes("* | text()")
        To get text nodes and elements containing text, do
        $parent->findnodes("*[text()] | text()")
        And so on.
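The three queries above are relative paths, so they are evaluated from $parent and return only its direct children. By contrast, an axis such as child:: applies to a single location step, so prefixing it to an absolute path that starts with / is not valid XPath, which is consistent with the "Invalid expression" error quoted earlier. A minimal sketch of the single-level lookups, assuming a small hypothetical sample document (not the poster's schema) and hypothetical variable names:

```perl
use strict;
use warnings;
use XML::LibXML qw(:all);

# Hypothetical sample document, used only for illustration.
my $dom = XML::LibXML->load_xml(
    string => '<root><a name="first"><b name="nested">deep</b></a>loose text<c/></root>'
);

my $parent = $dom->documentElement;

# Relative path: direct child elements of $parent only (a and c, not b).
my @children = $parent->findnodes('*');

# Same single level, restricted to children that carry a name attribute.
my @named = $parent->findnodes('*[@name]');

# Direct child elements and direct child text nodes together.
my @mixed = $parent->findnodes('* | text()');

print "children: ", join(', ', map { $_->nodeName } @children), "\n";
print "named:    ", join(', ', map { $_->nodeName } @named), "\n";
print "mixed:    ", scalar(@mixed), " nodes\n";
```

Because the context node is passed implicitly by calling findnodes on $parent, there is no need to build an absolute path from nodePath() just to reach one level further down.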
Re: XML::LibXML problem
by Khen1950fx (Canon) on Dec 29, 2011 at 10:19 UTC
    This should work better for you:
    #!/usr/bin/perl -l
    use strict;
    use warnings;
    use XML::LibXML qw(:all);

    my $dom = XML::LibXML->load_xml(string => <<'EOT');
    <title>Tale of Two Cities</title>
    EOT

    foreach my $node ($dom->findnodes('//*')) {
        print "node name is ", $node->nodeName();
        if ($node->hasChildNodes == 0) {
            print "No children.";
        }
        if ($node->textContent) {
            print "node text is: ", $node->textContent;
        }
    }
    Update: fixed typo.
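The script above reads the text through each element's textContent, which works on elements. To look at the text nodes themselves, which is what the original question was after, something like the following sketch may help. It assumes a hypothetical two-level sample document and hypothetical variable names; the only thing it relies on from the library is that the node-type constants are exported and that XML_TEXT_NODE has the DOM value 3, which is where the literal 3 in the question comes from.

```perl
use strict;
use warnings;
use XML::LibXML qw(:all);

# Hypothetical sample document, used only for illustration.
my $dom = XML::LibXML->load_xml(
    string => '<book><title>book title</title></book>'
);

# //* selects element nodes only, so a text-node test never fires on these.
my @elements = $dom->findnodes('//*');

# //text() selects the text nodes themselves.
my @texts = $dom->findnodes('//text()');

for my $node (@elements, @texts) {
    if ($node->nodeType == XML_TEXT_NODE) {
        print "text node: ", $node->nodeValue, "\n";
    }
    elsif ($node->nodeType == XML_ELEMENT_NODE) {
        print "element:   ", $node->nodeName, "\n";
    }
}
```

Using the named constants rather than bare numbers keeps the comparison readable and avoids having to remember the numeric node-type codes.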
Re: XML::LibXML problem
by thomas895 (Deacon) on Dec 29, 2011 at 09:58 UTC

    I am unclear on what you are trying to accomplish. Could you please provide an example of the XML you are trying to parse? Or perhaps how the XML::LibXML::Document object is set up?

    Side note: Maybe it is just that I do not understand this module fully, but where does the "3" come from in the nodeType comparison (3rd line of your code)?

    ~Thomas~