shak has asked for the wisdom of the Perl Monks concerning the following question:

Hi All, I have a huge XML file and need to extract the data below with XML::LibXML.
OUTPUT
VAL1,0,0
VAL2,0,0
VAL3,0,0
VAL4,0,0
VAL5,490783914,4532
My code
my $parser = XML::LibXML->new();
my $doc = $parser->parse_file($filename);
foreach my $book ($doc->findnodes('/mdc/md/mt[text()="VAL1"]')) {
    $val1 = $book->findnodes('./r[1]/text()');
    push(@Val, $val1);
}
Input XML File
<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">
  <md>
    <neid>
      <neun></neun>
      <nedn>GET_SUB</nedn>
      <nesw>R4BA06</nesw>
    </neid>
    <mi>
      <mts>20150429141500Z</mts>
      <gp>900</gp>
      <mt>VAL1</mt>
      <mt>VAL2</mt>
      <mt>VAL3</mt>
      <mt>VAL4</mt>
      <mt>VAL5</mt>
      <mt>VAL6</mt>
      <mt>VAL7</mt>
      <mt>VAL8</mt>
      <mv>
        <moid>NAME</moid>
        <r>0</r>
        <r>0</r>
        <r>0</r>
        <r>0</r>
        <r>490783914</r>
        <r>0</r>
        <r>0</r>
        <r>0</r>
      </mv>
      <mv>
        <moid>NAME1</moid>
        <r>0</r>
        <r>0</r>
        <r>0</r>
        <r>0</r>
        <r>4532</r>
        <r>0</r>
        <r>0</r>
        <r>0</r>
      </mv>
    </mi>
  </md>
</mdc>

Update

I was not aware of the "count(preceding-sibling)" function; your code worked fine.
I tried Googling the different options available in LibXML but could not find them. Can you help me with that?
What if I wanted the output to look something like this:
VAL1,NAME,0
VAL1,NAME1,0
VAL2,NAME,0
VAL2,NAME1,0

Replies are listed 'Best First'.
Re: Extraction of value with XMLLIB
by choroba (Cardinal) on May 05, 2015 at 15:00 UTC
    The basic problem is your XPath expression doesn't follow the structure of the document. <mt> is not a child of <md>, there's a <mi> in between. Therefore, something like the following should match all the VAL nodes:
    /mdc/md/mi/mt[contains(.,"VAL")]

    The same holds for the <r> elements: their parent is <mv>.
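
    To see the mismatch concretely, here is a minimal sketch that counts the matches for both paths. It assumes a trimmed inline copy of the sample (the values and the variable names @wrong and @right are made up for illustration) rather than the real file:

```perl
use strict;
use warnings;
use XML::LibXML;

# Trimmed stand-in for the sample document (assumption: same nesting,
# made-up values).
my $xml = <<'XML';
<mdc>
  <md>
    <mi>
      <mt>VAL1</mt>
      <mt>VAL2</mt>
      <mv>
        <moid>NAME</moid>
        <r>0</r>
        <r>7</r>
      </mv>
    </mi>
  </md>
</mdc>
XML

my $doc = XML::LibXML->load_xml(string => $xml);

# Path from the question: it skips <mi>, so no node is selected.
my @wrong = $doc->findnodes('/mdc/md/mt[text()="VAL1"]');

# Path that follows the actual nesting mdc/md/mi/mt.
my @right = $doc->findnodes('/mdc/md/mi/mt[text()="VAL1"]');

printf "without mi: %d, with mi: %d\n", scalar @wrong, scalar @right;
```

    With this trimmed input the first path selects nothing and the second selects the single VAL1 element.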

    The following works for me:

    #!/usr/bin/perl
    use warnings;
    use strict;
    use feature qw{ say };

    use XML::LibXML;

    my @val;
    my $doc = 'XML::LibXML'->load_xml( location => 'file.xml' );
    for my $book ($doc->findnodes('/mdc/md/mi/mt[contains(.,"VAL")]')) {
        my $order = 1 + $book->findvalue('count(preceding-sibling::mt)');
        my $rs = $book->findnodes("../mv/r[$order]");
        say join ', ', map $_->textContent, $book, @$rs;
    }

    You can get the same logic with XML::XSH2, which is a wrapper around XML::LibXML:

    open file.xml ;
    for /mdc/md/mi/mt[xsh:match(.,'^VAL[0-9]+')] {
        my $order = 1 + count(preceding-sibling::mt) ;
        echo xsh:join(', ', (.), ../mv/r[$order]) ;
    }

    لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
      Hi All, thanks for the information; "count(preceding-sibling)" helped. Can I have that feature available in LibXML? Also, is it possible to add the NAME as well? Once again, thanks.
        I don't understand. Are you requesting the count feature to exist in LibXML? My first example shows it exists already.

        There was no NAME in your original question, only SNAME. There's only one SNAME per CLT, so it's not clear what output you want.
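
        If the format shown in the update to the question is what's wanted (one line per mt/mv pair, with the moid text as the name), a sketch along these lines might do. That reading of the request is an assumption, as are the trimmed inline XML with made-up values and the @lines array:

```perl
use strict;
use warnings;
use XML::LibXML;

# Trimmed stand-in for the sample document (assumption: same nesting,
# made-up values).
my $xml = <<'XML';
<mdc>
  <md>
    <mi>
      <mt>VAL1</mt>
      <mt>VAL2</mt>
      <mv>
        <moid>NAME</moid>
        <r>0</r>
        <r>7</r>
      </mv>
      <mv>
        <moid>NAME1</moid>
        <r>3</r>
        <r>0</r>
      </mv>
    </mi>
  </md>
</mdc>
XML

my $doc = XML::LibXML->load_xml(string => $xml);

my @lines;
for my $mt ($doc->findnodes('/mdc/md/mi/mt')) {
    # 1-based position of this <mt> among its <mt> siblings.
    my $order = 1 + $mt->findvalue('count(preceding-sibling::mt)');

    # Visit each <mv> separately so its <moid> can be printed next to
    # the <r> found at the same position as the <mt>.
    for my $mv ($mt->findnodes('../mv')) {
        my $name  = $mv->findvalue('moid');
        my $value = $mv->findvalue("r[$order]");
        push @lines, join ',', $mt->textContent, $name, $value;
    }
}

print "$_\n" for @lines;
```

        The only change from the first example is the inner loop over the mv elements; the positional trick with count(preceding-sibling::mt) stays the same.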

        لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
Re: Extraction of value with XMLLIB
by Anonymous Monk on May 05, 2015 at 14:56 UTC
    So what is the problem?