Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

In this XML, I want only the elements outside of the breakfast menu. parentNode returns the entire XML.

<?xml version="1.0" encoding="UTF-8"?>
<restaurant>
  <timestart>0600</timestart>
  <timeend>1030</timeend>
  <city>Boston</city>
  <country>US</country>
  <breakfast_menu>
    <food>
      <name>Belgian Waffles</name>
      <price>$5.95</price>
      <description>Two of our famous Belgian Waffles with plenty of real maple syrup</description>
      <calories>650</calories>
    </food>
    <food>
      <name>Strawberry Belgian Waffles</name>
      <price>$7.95</price>
      <description>Light Belgian waffles covered with strawberries and whipped cream</description>
      <calories>900</calories>
    </food>
  </breakfast_menu>
</restaurant>

my $root = $dom->getDocumentElement();
foreach my $node ($root->findnodes('//breakfast_menu')) {
    print $node->parentNode();
}

Re: XML::LibXML parentNode only root element
by ikegami (Patriarch) on Apr 24, 2023 at 15:23 UTC

    It doesn't. It returns a single restaurant node. But you proceed to print it and all of its descendants.

    To get the desired nodes, you can use the following:

    my @nodes = $doc->findnodes( '/restaurant/*[ not( self::breakfast_menu ) ]' );

    Maybe you'll find it more useful to simply delete the offending node.

    $_->unbindNode() for $doc->findnodes( "/restaurant/breakfast_menu" );
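
    The points in this reply can be checked with a minimal sketch. It assumes a shortened, hypothetical copy of the question's document held in a string, and it uses $doc for the parsed document as above. It reads the name of the node that parentNode returns, collects the names selected by the XPath filter, and then detaches the menu.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical, shortened copy of the document from the question.
my $xml = '<restaurant>'
        . '<timestart>0600</timestart>'
        . '<city>Boston</city>'
        . '<breakfast_menu><food><name>Belgian Waffles</name></food></breakfast_menu>'
        . '</restaurant>';
my $doc = XML::LibXML->load_xml( string => $xml );

# parentNode returns one element node; printing that node is what
# serializes it together with all of its descendants.
my ($menu) = $doc->findnodes('//breakfast_menu');
my $parent_name = $menu->parentNode->nodeName;

# Children of <restaurant> other than <breakfast_menu>.
my @names = map { $_->nodeName }
    $doc->findnodes('/restaurant/*[ not( self::breakfast_menu ) ]');

# Alternatively, detach the menu and keep working with the rest of the tree.
$_->unbindNode() for $doc->findnodes('/restaurant/breakfast_menu');
my $remaining = $doc->findnodes('/restaurant/breakfast_menu')->size;

print "parent: $parent_name\n";
print "kept: @names\n";
print "menus left: $remaining\n";
```

    Note that unbindNode changes the document itself, whereas the XPath filter leaves it untouched, so the choice depends on whether the menu is needed later.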
Re: XML::LibXML parentNode only root element
by choroba (Cardinal) on Apr 24, 2023 at 17:33 UTC
    You can use the following-sibling and preceding-sibling axes.
    #!/usr/bin/perl
    use warnings;
    use strict;
    use feature qw{ say };

    use XML::LibXML;

    my $dom = 'XML::LibXML'->load_xml(location => shift);
    my $root = $dom->getDocumentElement();
    my ($bfm) = $root->findnodes('//breakfast_menu');
    say for $bfm->findnodes('(following-sibling::* | preceding-sibling::*)');
Re: XML::LibXML parentNode only root element
by kcott (Archbishop) on Apr 24, 2023 at 21:48 UTC

    Or, keeping the XPath expression simple, and also testing an element that comes after the breakfast menu:

    #!/usr/bin/env perl

    use strict;
    use warnings;

    use XML::LibXML;

    my $dom = XML::LibXML->load_xml(string => <<'EOT');
    <?xml version="1.0" encoding="UTF-8"?>
    <restaurant>
      <timestart>0600</timestart>
      <timeend>1030</timeend>
      <city>Boston</city>
      <country>US</country>
      <breakfast_menu>
        <food>
          <name>Belgian Waffles</name>
          <price>$5.95</price>
          <description>Two of ...</description>
          <calories>650</calories>
        </food>
        <food>
          <name>Strawberry Belgian Waffles</name>
          <price>$7.95</price>
          <description>Light Belgian ...</description>
          <calories>900</calories>
        </food>
      </breakfast_menu>
      <edgecase>test after breakfast</edgecase>
    </restaurant>
    EOT

    my $root = $dom->getDocumentElement();

    for ($root->findnodes('/restaurant/*')) {
        next if 0 == index($_, '<breakfast_menu>');
        print "$_\n";
    }

    Output:

    <timestart>0600</timestart>
    <timeend>1030</timeend>
    <city>Boston</city>
    <country>US</country>
    <edgecase>test after breakfast</edgecase>

    Note: If you only want the elements before the breakfast menu, change "next" to "last" in the for loop.

    — Ken

      You used

      next if 0 == index( $_, '<breakfast_menu>' );

      That would fail if the serialization has any attributes or a prefix. Also, it completely ignores the namespace. The correct way to do the check outside of the XPath:

      next if $_->nodeName eq "breakfast_menu" && !defined( $_->namespaceURI );
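
      To illustrate the difference, here is a small sketch that runs both checks against a hypothetical document in which the menu element carries an attribute (the served attribute is invented for this example). The string check compares against the serialized text, so it no longer recognizes the element, while the node check still does.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical input: the menu element has an attribute.
my $xml = '<restaurant><city>Boston</city>'
        . '<breakfast_menu served="daily"><food/></breakfast_menu>'
        . '</restaurant>';
my $dom = XML::LibXML->load_xml( string => $xml );

my ( @by_string, @by_name );
for my $node ( $dom->findnodes('/restaurant/*') ) {
    # String check: looks for the literal start tag in the serialized element.
    push @by_string, $node->nodeName
        unless 0 == index( $node->toString, '<breakfast_menu>' );

    # Node check: compares the element name and requires that no namespace is set.
    push @by_name, $node->nodeName
        unless $node->nodeName eq 'breakfast_menu'
            && !defined $node->namespaceURI;
}

print "string check kept: @by_string\n";
print "node check kept:   @by_name\n";
```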