fragtzack has asked for the wisdom of the Perl Monks concerning the following question:

Hi Perl experts,
Please help with an XML issue; I am new to XML and XML::LibXML.
Given the following XML slurped into $string:

<?xml version="1.0" encoding="UTF-8"?>
<scMetrics>
 <objCount>15</objCount>
 <size>1157</size>
 <realsize>1034</realsize>
 <metadatasize>25281</metadatasize>
 <totalSize>26315</totalSize>
</scMetrics>

And given the following code:

my $parser = XML::LibXML->new;
my $doc = $parser->parse_string($string);
foreach my $ten ($doc->findnodes('/scMetrics')) {
    print $ten->nodeName."\n";
    foreach my $child ($ten->getChildnodes) {
        print "name ".$child->nodeName."\n";
        print "text ".$child->textContent." \n";
        #$sub_hash{$child->nodeName}=$child->textContent;
    }
    print "========================\n";
}

Produces the following output:

scMetrics
name text
text
name objCount
text 15
name text
text
name size
text 1157
name text
text
name realsize
text 1034
name text
text
name metadatasize
text 25281
name text
text
name totalSize
text 26315
========================


This output suggests there are nodes named "text" with empty content.
What am I doing wrong in the XML::LibXML code?
Corion, thanks so much! I stripped the whitespace from the raw text with s/^\s+//g before slurping it into $string, and that worked!
Thanks much, good weekend.

Re: XML::LibXML problem
by Corion (Patriarch) on Sep 19, 2014 at 16:28 UTC

    There are text nodes. Whitespace is significant in XML, and there is whitespace between the end of (for example) </metadatasize> and the next tag, <totalSize>.

    I'm not sure how you convince XML::LibXML to ignore the text() child nodes. Maybe using /scMetrics/* instead of manually visiting the children will give you only the "proper" XML nodes instead of also including the text children on /scMetrics.
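
A minimal sketch of the /scMetrics/* idea, assuming the XML from the question is already in $string (it is inlined here so the snippet runs on its own). The * step in XPath matches element children only, so the whitespace text nodes never reach the loop. The %sub_hash name is taken from the commented-out line in the question; the final print is only for illustration.

```perl
use strict;
use warnings;
use XML::LibXML;

# Sample document from the question, including the whitespace between tags.
my $string = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<scMetrics>
 <objCount>15</objCount>
 <size>1157</size>
 <realsize>1034</realsize>
 <metadatasize>25281</metadatasize>
 <totalSize>26315</totalSize>
</scMetrics>
XML

my $doc = XML::LibXML->new->parse_string($string);

# '/scMetrics/*' selects only the element children of the root,
# so whitespace-only text nodes are not returned.
my %sub_hash;
foreach my $child ($doc->findnodes('/scMetrics/*')) {
    $sub_hash{ $child->nodeName } = $child->textContent;
}

print "$_ = $sub_hash{$_}\n" for sort keys %sub_hash;
```

This keeps the filtering in the XPath expression, which may be easier to adjust later (for example to '/scMetrics/size') than filtering inside the loop.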

Re: XML::LibXML problem
by tangent (Parson) on Sep 19, 2014 at 16:49 UTC
    Use 'nonBlankChildNodes' instead of 'getChildnodes'
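
A short sketch of that swap, again with the question's XML inlined so it runs standalone. As far as I understand the XML::LibXML::Node documentation, nonBlankChildNodes behaves like childNodes but skips text and CDATA nodes that consist only of whitespace; comments or text with real content would still be returned. The @names array is just a helper for this example.

```perl
use strict;
use warnings;
use XML::LibXML;

# Sample document from the question, including the whitespace between tags.
my $string = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<scMetrics>
 <objCount>15</objCount>
 <size>1157</size>
 <realsize>1034</realsize>
 <metadatasize>25281</metadatasize>
 <totalSize>26315</totalSize>
</scMetrics>
XML

my $doc = XML::LibXML->new->parse_string($string);

my @names;
foreach my $ten ($doc->findnodes('/scMetrics')) {
    # Same loop as in the question, but whitespace-only text nodes are skipped.
    foreach my $child ($ten->nonBlankChildNodes) {
        push @names, $child->nodeName;
        print "name ", $child->nodeName, "\n";
        print "text ", $child->textContent, "\n";
    }
}
```

Compared with stripping whitespace from the raw text before parsing, this leaves the input untouched and only changes which children the loop visits.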