madbee has asked for the wisdom of the Perl Monks concerning the following question:
Hello! I have the XML document below. From it, I need to extract a node's value based on a keyword; i.e., based on the keyword "design", I need to extract the entire string between the header tags.
    <root>
      <part>
        <sect>
          <header> This is a design XZY document for Project </header>
        </sect>
      </part>
    </root>

For this, I have the below Perl script:

    my $dom = XML::LibXML->new->parse_file($file);
    my $nodeset = $dom->find('/root/part/sect/header');
    foreach my $node ($nodeset->get_nodelist) {
        $node->string_value;
        if ($node =~ m/design/i) {
            my $design = $node;
            print $design;
        }
    }
The problem is that I need to do this across multiple XML files, and I noticed that in some of them the string I am looking for is in another part of the document. For example, it is under:
    <root>
      <para>This is a design XZY document for Project</para>
      <part>
        <sect>
          <header> This is some header </header>
        </sect>
      </part>
    </root>
The value occurring under the root/para tags is an anomaly, but it is valid and I have to accommodate it. Given such irregular XML files, is there a way I can handle these two scenarios with one generic piece of code? Of course, a more roundabout way would be to first check the expected node and, if nothing is found, fall back to looking under root. But I was wondering if there is a simpler way to do this and was hoping for some help here.
Thanks in advance for your time and apologies if the question is not clear enough.
Regards, Madbee
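One possible way to cover both layouts with a single query is an XPath union (`|`) over the two locations mentioned in the question. The sketch below is only an illustration under stated assumptions: the sub name `find_by_keyword` and the inline sample strings are made up for the example, and it assumes the text only ever appears under `/root/part/sect/header` or `/root/para`.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: return the trimmed text of every node at either of
# the two known locations whose text contains $keyword (case-insensitive).
sub find_by_keyword {
    my ($xml, $keyword) = @_;
    my $dom = XML::LibXML->load_xml(string => $xml);
    my @hits;
    # The XPath union (|) selects nodes from both layouts in one query.
    for my $node ($dom->findnodes('/root/part/sect/header | /root/para')) {
        my $text = $node->textContent;
        next unless $text =~ /\Q$keyword\E/i;
        $text =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
        push @hits, $text;
    }
    return @hits;
}

# Sample documents modelled on the two layouts in the question.
my $regular = '<root><part><sect><header> This is a design XZY document '
            . 'for Project </header></sect></part></root>';
my $irregular = '<root><para>This is a design XZY document for Project</para>'
              . '<part><sect><header> This is some header </header></sect>'
              . '</part></root>';

print "$_\n" for find_by_keyword($regular,   'design');
print "$_\n" for find_by_keyword($irregular, 'design');
```

If the text can turn up at locations that are not known in advance, a broader path such as `//header | //para` might be used instead, at the cost of possibly matching more nodes than intended.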
Re: Get Node Value from irregular XML
by roboticus (Chancellor) on Jun 29, 2013 at 17:16 UTC
by madbee (Acolyte) on Jun 29, 2013 at 18:19 UTC
by roboticus (Chancellor) on Jun 29, 2013 at 19:25 UTC

Re: Get Node Value from irregular XML
by NetWallah (Canon) on Jun 29, 2013 at 17:46 UTC

Re: Get Node Value from irregular XML (xpather.pl)
by Anonymous Monk on Jun 30, 2013 at 04:15 UTC
by Anonymous Monk on Jun 30, 2013 at 04:41 UTC
by Anonymous Monk on Jun 30, 2013 at 05:13 UTC