XML dont include parent node


	PerlMonks


by zak_s (Initiate)

on Dec 10, 2013 at 16:59 UTC


zak_s has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks!
I'm not an expert in Perl. I have tried to find a way to get this done, but it is not working well for me. What is the best XPath expression to get the child elements of a node without including the parent tags?

For example:
<Root>
  <Parent>
    <child1></child1>
    <child2></child2>
    <child3></child3>
  </Parent>
</Root>

I'm only interested in the three children within the Parent tags, each of which has a different name.
What I'm currently using is '//Parent', with findnodes from XML::LibXML.


Replies are listed 'Best First'.
Re: XML dont include parent node by tangent (Parson) on Dec 10, 2013 at 17:35 UTC
This is not a modified XPath, but it shows how you can access the child nodes:

use XML::LibXML;

my $string = q|
<Root>
  <Parent>
    <child1>Child 1</child1>
    <child2>Child 2</child2>
    <child3>Child 3</child3>
  </Parent>
</Root>|;

my $doc = XML::LibXML->load_xml(string => $string);
my @nodes = $doc->findnodes('//Parent');
for my $node (@nodes) {
    my @childnodes = $node->childNodes or next;
    for my $cnode (@childnodes) {
        if ($cnode->nodeName =~ m/^child/) {
            print 'name: '. $cnode->nodeName . ', ';
            print 'content: '. $cnode->textContent ."\n";
        }
    }
}

# Output:
# name: child1, content: Child 1
# name: child2, content: Child 2
# name: child3, content: Child 3

See XML::LibXML::Node
Re^2: XML dont include parent node by zak_s (Initiate) on Dec 10, 2013 at 17:51 UTC
Thanks, this helps. One more question: is there any way to ignore empty lines when copying nodes? Is there an expression I can use to delete all empty lines?
Re: XML dont include parent node by smls (Friar) on Dec 11, 2013 at 01:55 UTC
Based on your description I'm not 100% sure which of the following two things you want to achieve, so let me address both:

A) Get a list of all child nodes of the `Parent` node

One solution is to match the `Parent` node via an XPath expression, and then call the `childNodes` method on it (which is what tangent already suggested above).

An alternative solution is to use an asterisk wildcard directly in the XPath expression, e.g. in your example the expression passed to `findnodes` would become `'//Parent/*'`. However, if there are multiple `Parent` nodes in the document, this would return all their children as one flat list, whereas tangent's solution allows you to handle each set separately.

Another difference is that the asterisk expression only matches element nodes, whereas the `childNodes` method also lists text or CDATA nodes (including whitespace strings in between the child elements, although there is an alternative method called `nonBlankChildNodes` which avoids that).

If you are indeed only interested in the child elements, but want to process each set separately in case of multiple `Parent` nodes, you could either combine `childNodes` with a check for `nodeName` (like tangent's solution does), or use a stand-alone asterisk query:

my $doc = XML::LibXML->load_xml( ... );

foreach my $parent ($doc->findnodes('//Parent')) {
    my @childElements = $parent->findnodes('*');
    # ...do stuff with @childElements...
}

B) Get a string serialization of the `Parent` node, but with the actual `Parent` start/end tags stripped...

...akin to the `.innerHTML` property available in JavaScript/DOM.

XML::LibXML does not provide this feature, and the reason is probably that, unlike with HTML, an XML snippet requires a single root element in order to be valid XML.

You could still achieve it by getting the list of child nodes (see section A above), calling the `toString` method on each, and concatenating the resulting strings:

my $doc = XML::LibXML->load_xml( ... );

foreach my $parent ($doc->findnodes('//Parent')) {
    print "Found Parent node with the following XML content:";
    print innerXML($parent);
}

sub innerXML {
    join '', map { $_->toString } shift->childNodes();
}

Or by calling `toString` directly on the `Parent` node, and using regexes to try and strip off the outer start/end tags (but this will be messy and error-prone).
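The differences between the asterisk query, `childNodes`, and `nonBlankChildNodes` described above can be sketched as follows. This is a minimal illustration, assuming XML::LibXML is installed and using the default parser options (blank text nodes are kept); the sample document and the variable names are made up for the example. It also touches on the follow-up question about empty lines, since `nonBlankChildNodes` skips whitespace-only text nodes.

```perl
use strict;
use warnings;
use XML::LibXML;

# Illustrative sample document (same shape as the one in the question).
my $xml = <<'XML';
<Root>
  <Parent>
    <child1>Child 1</child1>
    <child2>Child 2</child2>
    <child3>Child 3</child3>
  </Parent>
</Root>
XML

my $doc = XML::LibXML->load_xml(string => $xml);
my ($parent) = $doc->findnodes('//Parent');

# Element children only (XPath '*' never matches text nodes).
my @elements = $parent->findnodes('*');

# Every child node, including the whitespace text nodes between elements.
my @all = $parent->childNodes;

# Like childNodes, but whitespace-only text/CDATA nodes are skipped.
my @non_blank = $parent->nonBlankChildNodes;

printf "%-20s %d\n", "findnodes('*')",     scalar @elements;
printf "%-20s %d\n", 'childNodes',         scalar @all;
printf "%-20s %d\n", 'nonBlankChildNodes', scalar @non_blank;

# Serialize the children without the Parent tags and without blank lines.
my $inner = join '', map { $_->toString } @non_blank;
print "$inner\n";
```

With this input, the asterisk query and `nonBlankChildNodes` should each yield the three elements, while `childNodes` also counts the four whitespace text nodes around them.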
Re: XML dont include parent node by derby (Abbot) on Dec 10, 2013 at 17:39 UTC
You could always use the starts-with function:

my @nodes = $root->findnodes( '/Root//*[starts-with(name(), "child")]' );

-derby
Re^2: XML dont include parent node by smls (Friar) on Dec 11, 2013 at 01:07 UTC
I think you're taking the OP's example XML snippet too literally... :)
Re: XML dont include parent node by Discipulus (Canon) on Dec 11, 2013 at 09:43 UTC
Hello there.

For future questions, remember to be as precise as you can, so that others can help you effectively (consider reading Understanding-and-Using-PerlMonks).

I humbly think that XPath is not intended to do matching as in your case (child1, child2, child3...). I also think another design for your data would be better, if you have the choice: all tags named 'child', each with a numerical 'id'.

In Perl there are many ways to get the work done (and speaking of XML they are many * many; see the poll), so I present an XML::Twig solution. Handlers are subs that are called during parsing; here you can use a normal Perl regex to filter out unwanted results (I put a 'ufo' in the XML data).

use warnings;
use strict;
use XML::Twig;

my $xml = <<'XML';
<Root>
  <Parent>
    <child1>Child 1</child1>
    <child2>Child 2</child2>
    <child3>Child 3</child3>
    <ufo> Ufo there!</ufo>
  </Parent>
</Root>
XML

my $twig = XML::Twig->new(
    pretty_print  => 'indented',
    twig_handlers => { '/Root/Parent/*' => \&field },
);
$twig->parse($xml);

# Called for every child of Parent; skip anything not named child*.
sub field {
    my ($twig, $field) = @_;
    return unless $field->gi() =~ /^child/i;
    $field->print;
    # OR: print $field->text();
}

# OUTPUT
#
#   <child1>Child 1</child1>
#   <child2>Child 2</child2>
#   <child3>Child 3</child3>

Hth
L

There are no rules, there are no thumbs..
Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
Re^2: XML dont include parent node by Anonymous Monk on Dec 11, 2013 at 09:50 UTC
"I humbly think that xpath are not intended to do match as in your case (child1, child2, child3..)"

Hmm, but they seem to do exactly that, which kinda means they are intended to do it:

$ xmllint --xpath " //Parent " foot.xml
<Parent>
  <child1/>
  <child2/>
  <child3/>
</Parent>

$ xmllint --xpath " //Parent/* " foot.xml
<child1/><child2/><child3/>

$ xmllint --xpath " /Root//*[starts-with(name(), 'child')] " foot.xml
<child1/><child2/><child3/>

"... design for your data ..."

I think too often the person asking how-something-xml doesn't have a choice in the design :)
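The same comparison can be run from Perl. This is a rough sketch, assuming XML::LibXML is installed; the inline document (with an extra `ufo` element, borrowed from the XML::Twig example earlier in the thread) and the variable names are illustrative only.

```perl
use strict;
use warnings;
use XML::LibXML;

# Illustrative document: three child* elements plus one unrelated element.
my $doc = XML::LibXML->load_xml(string => <<'XML');
<Root>
  <Parent>
    <child1/>
    <child2/>
    <child3/>
    <ufo/>
  </Parent>
</Root>
XML

# Every element child of any Parent, regardless of name.
my @all_names = map { $_->nodeName } $doc->findnodes('//Parent/*');

# Only element children whose name begins with "child" (XPath 1.0 starts-with).
my @child_names = map { $_->nodeName }
    $doc->findnodes('//Parent/*[starts-with(name(), "child")]');

print "all:   @all_names\n";
print "child: @child_names\n";
```

The wildcard alone picks up `ufo` as well; the `starts-with` predicate narrows the result to the `child*` elements without any filtering on the Perl side.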
Re^3: XML dont include parent node by Discipulus (Canon) on Dec 11, 2013 at 12:49 UTC
I suspected I was wrong there, thanks.

Maybe that syntax (starts-with(name())...) is not available in XML::Twig or, more probably, I'm not able to get it right:

my @all = $twig->get_xpath('/Root/Parent/child');
# gives: error in xpath expression...

my @all = $twig->get_xpath('/Root//*[starts-with(name(), "child")]');
# also gives error in xpath expression..
# same results with the findnodes method..

The XML::Twig docs say these methods are similar to the XML::LibXML methods, 'similar' probably being the key word. Still, if someone is able to make it work, please show how!

L
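One way to get the same selection in XML::Twig without relying on `starts-with` is to pass a regex as the condition to the `children` navigation method. The sketch below assumes XML::Twig is installed and that regex conditions behave as documented for navigation methods; it is not a fix for the `get_xpath` errors above, just an alternative route. The sample data and variable names are illustrative.

```perl
use strict;
use warnings;
use XML::Twig;

# Illustrative data, including one element that should be filtered out.
my $xml = <<'XML';
<Root>
  <Parent>
    <child1>Child 1</child1>
    <child2>Child 2</child2>
    <child3>Child 3</child3>
    <ufo>Ufo there!</ufo>
  </Parent>
</Root>
XML

my $twig = XML::Twig->new;
$twig->parse($xml);

my @names;
# Handle each Parent separately, then keep only children whose tag matches.
for my $parent ($twig->root->children('Parent')) {
    for my $child ($parent->children(qr/^child/)) {
        push @names, $child->tag;
        printf "name: %s, content: %s\n", $child->tag, $child->text;
    }
}
```

Because the filter is an ordinary Perl regex, it can be tightened (for example to `qr/^child\d+$/`) without touching any XPath syntax.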
