PerlMonks  

Re^2: XML parsing and Lists

by madbee (Acolyte)
on Jul 05, 2013 at 00:23 UTC ( [id://1042553] )


in reply to Re: XML parsing and Lists
in thread XML parsing and Lists

Thanks for responding. I was not aware of the XPathContext module in XML::LibXML. I was trying to use XML::XPath directly and ran into a seemingly endless series of installation issues that I could not get past.

I will try this approach. Basically, I have to create an array of nodes for the path:

$parser = XML::LibXML->new;
$dom = $parser->parse_file($file);
$root = $dom->getDocumentElement;
$dom->setDocumentElement($root);
my $xc = XML::LibXML::XPathContext->new($file);
my @nodes = $xc->findnodes('//Article//Part//Sect//H5[contains(.,"Include")]', $dom);
if (@nodes) {
    $count = $xc->findvalue('count(//Article//Part//Sect//LI)', $dom);
    print $count;
}

Am I on the right track? Anything I'm missing?

Thanks much.

Replies are listed 'Best First'.
Re^3: XML parsing and Lists
by choroba (Cardinal) on Jul 05, 2013 at 00:50 UTC
    You are overcomplicating the problem. Do not use setDocumentElement: it sets a new root element, which you do not need here. The constructor of XPathContext takes a context node as a parameter, not a file. This is a Short, Self Contained, Correct Example:
    #!/usr/bin/perl
    use warnings;
    use strict;
    use XML::LibXML;

    my $dom = XML::LibXML->load_xml(string => << '__XML__');
    <Article> <!-- fixed typo -->
      <Main>
        <Sect>
          <H4>Include</H4>
          .....
          <P1> This is the criteria</P1>
          <L>
            <LI> <LI_Label>1.</LI_Label> <LI_Title>Critera 1</LI_Title> </LI>
            <LI> <LI_Label>2.</LI_Label> <LI_Title>Critera 2</LI_Title> </LI>
            <LI> <LI_Label>3.</LI_Label> <LI_Title>Critera 3</LI_Title> </LI>
            <LI> <LI_Label>4.</LI_Label> <LI_Title>Critera 3</LI_Title> </LI>
          </L> <!-- fixed missing closing tag -->
        </Sect>
      </Main>
    </Article>
    __XML__

    my $xc = XML::LibXML::XPathContext->new;
    my $count = $xc->findvalue('count(//Article//Sect//LI)', $dom);
    print "$count list nodes found.\n" if $count;
    لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
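The point about the constructor can be illustrated with a minimal sketch. It assumes a tiny made-up document (not the poster's file) and passes the document node to `XML::LibXML::XPathContext->new`, so the later query needs no extra node argument:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical two-item document, used only for illustration.
my $dom = XML::LibXML->load_xml(
    string => q{<Article><Sect><LI/><LI/></Sect></Article>}
);

# The constructor takes a context node (here the document), not a file name.
my $xc = XML::LibXML::XPathContext->new($dom);

# With a context node set, findvalue needs only the XPath expression.
my $count = $xc->findvalue('count(//Article//Sect//LI)');

print "$count list nodes found.\n" if $count;
```

Either style should work: set the context node once in the constructor, or create an empty context and pass the node on each call, as in the example above.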

      Well, you're not really using XPathContext for anything, so it isn't required. This works:

      $dom->find('count(//Article//Sect//LI)' );

      Hello! I tried the XPathContext approach and it works great. However, I do need to check for the condition "where H4 contains Include". The document can have multiple sections like the one above, and I need to count the list elements of this particular section only.

      This is the expression I am trying, which I know is wrong since it now looks for LI under H4. I am not sure if it's even possible to combine the two conditions in one expression, so I am looking for some help here.

      Objective: count the number of LI elements under //Article//Main//Sect where the value of H4 contains "Include".

      $count = $dom->findvalue("count(//Article//Main//Sect//H4[contains(.,\"Include\")]/LI)");
      print $count;

      I would greatly appreciate any help in this regard. Thanks!

        LI is not part of the H4. Move H4 into the condition:
        'count(//Article//Sect[contains(H4,"Include")]//LI)'
        لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
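A minimal sketch of that predicate in action, assuming a made-up document with two sections (the "Exclude" section and the item contents are invented for illustration). Only the LI elements in the section whose H4 contains "Include" should be counted:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical document: one "Include" section (2 items), one "Exclude" section (3 items).
my $dom = XML::LibXML->load_xml(string => q{<Article>
  <Main>
    <Sect><H4>Include</H4><L><LI>a</LI><LI>b</LI></L></Sect>
    <Sect><H4>Exclude</H4><L><LI>c</LI><LI>d</LI><LI>e</LI></L></Sect>
  </Main>
</Article>});

# The predicate filters Sect elements by their H4 child;
# //LI then descends only into the sections that passed the filter.
my $count = $dom->findvalue('count(//Article//Sect[contains(H4,"Include")]//LI)');

print "$count list nodes found.\n";   # 2 for the document above
```

Note that `contains` is case-sensitive in XPath 1.0, so a heading spelled "include" would not match this expression.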

      Yes, that was an error. I don't need to create a new root. However, I do need to search for the "Include" node since the document can have multiple H5 nodes.

      Speaking of the document, the XML structure is not consistent across documents. In this one, my list is under H5, but in another it can be anywhere else. So, if I give a range of paths as below, could you please let me know if this is correct?

      Thanks again for your help. I really appreciate your time.

      my $xc = XML::LibXML::XPathContext->new;
      my $count = $xc->findvalue('count(//Article//Sect//LI|//Article//Sect//Part//LI|//Article//Part//Li)', $dom);
      print "$count list nodes found.\n" if $count;
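Two properties of XPath 1.0 matter for an expression like the one above: a union (`|`) yields a node set, so a node matched by several paths is counted once, and element names are case-sensitive, so `Li` does not match `LI`. The sketch below demonstrates both on a made-up document; the layout and variable names are assumptions for illustration only:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical layout: one LI directly in Sect, one in Sect/Part, one in Article/Part.
my $dom = XML::LibXML->load_xml(string => q{<Article>
  <Sect><LI/><Part><LI/></Part></Sect>
  <Part><LI/></Part>
</Article>});

# Overlapping paths: the LI under Sect/Part matches all three, but is counted once.
my $union = $dom->findvalue(
    'count(//Article//Sect//LI|//Article//Sect//Part//LI|//Article//Part//LI)'
);

# Same expression with the last step spelled "Li": that path matches nothing.
my $typo = $dom->findvalue(
    'count(//Article//Sect//LI|//Article//Sect//Part//LI|//Article//Part//Li)'
);

# If the LI elements may sit anywhere below Article, one path is enough.
my $broad = $dom->findvalue('count(//Article//LI)');

print "union: $union, with Li typo: $typo, broad: $broad\n";
```

If the condition on a heading is still needed, the broad path would count too much, and a predicate on the enclosing element is the more likely fit.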
