XML::LibXML::Reader and XPATH

mat21 has asked for the wisdom of the Perl Monks concerning the following question:

Dear all,
I am using XML::LibXML::Reader to parse large XML files and I am trying to use some of its XPath-based methods.
There is something (probably simple) that I don't understand.
For instance, I create a pattern:

 
my $reader = XML::LibXML::Reader->new(location => $xmlfile) or die "cannot read $xmlfile\n";
while ($reader->read) {
   my $pattern = XML::LibXML::Pattern->new('//entry');
   $reader->nextPatternMatch($pattern);
}

There is no match, although the XML file contains many entry tags (same behaviour for any tag).
The first node of the XML file contains information about the schema:

<uniprot xmlns="http://uniprot.org/uniprot" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://uniprot.org/uniprot http://www.uniprot.org/support/docs/uniprot.xsd">

If I replace it with just <uniprot>, all my XPath queries work. I guess I have to do something to declare the schema, but the tags do not have any prefix, unlike the examples in the documentation.
I don't know what to do. Any advice would be welcome.
Thanks


Re: XML::LibXML::Reader and XPATH by dHarry (Abbot) on Feb 20, 2009 at 11:06 UTC
This has nothing to do with Perl or libxml; it is about XML.

You wrote: "but the tags do not have any prefix like examples in the documentation". Well, you can add the prefixes, i.e. qualify the tags with the namespace. But the uniprot element declaration looks suspicious to me.

You might want to spend some time choosing the right strategy for deciding what the default/target namespace should be; see DefaultNamespace.pdf for a discussion. It gives a few examples of how to do it. HTH
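To make "qualify the tags with the namespace" concrete: the XML file itself does not need to change. A minimal sketch, assuming the prefix only has to exist in the pattern: XML::LibXML::Pattern->new takes an optional hash that maps prefixes to namespace URIs. The prefix "u" and the two-entry sample document are made up for illustration; only the namespace URI comes from the question.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::Reader;

# Tiny stand-in for the real UniProt file (hypothetical sample data).
my $xml = <<'XML';
<uniprot xmlns="http://uniprot.org/uniprot">
  <entry><accession>P00001</accession></entry>
  <entry><accession>P00002</accession></entry>
</uniprot>
XML

# Bind an arbitrary prefix ("u") to the document's default namespace URI.
# The prefix exists only in the pattern, not in the XML.
my $pattern = XML::LibXML::Pattern->new(
    '//u:entry',
    { u => 'http://uniprot.org/uniprot' },
);

my $reader = XML::LibXML::Reader->new(string => $xml)
    or die "cannot create reader\n";

my $matches = 0;
while ($reader->nextPatternMatch($pattern) == 1) {
    # Count start tags only; the same element may be reported again
    # when the reader reaches its end tag.
    next unless $reader->nodeType == XML_READER_TYPE_ELEMENT;
    $matches++;
}

print "matched $matches entry elements\n";
```

Note that the pattern is compiled once, outside the loop, and that nextPatternMatch does the reading itself, so no separate $reader->read loop is needed around it.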
Re^2: XML::LibXML::Reader and XPATH and default namespace by mat21 (Beadle) on Feb 20, 2009 at 15:12 UTC
Thanks for your answer and the link. It helped me find another interesting link, "XPath vs the default namespace", and it turns out not to be so simple. I don't think XML::LibXML::XPathContext is compatible with XML::LibXML::Reader, which is annoying. The files I am parsing are really big and I don't want to use DOM or SAX...
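One possible way to combine the two, offered as a sketch rather than a definitive answer: the reader's copyCurrentNode(1) returns a deep DOM copy of the current element only, and an XML::LibXML::XPathContext can be built on that small fragment, so the whole file never has to be in memory. The prefix "u", the sample data, and the assumption that each entry has an accession child are illustrative.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::Reader;

my $ns = 'http://uniprot.org/uniprot';

# Hypothetical sample standing in for the large file.
my $xml = <<"XML";
<uniprot xmlns="$ns">
  <entry><accession>P00001</accession><name>ONE</name></entry>
  <entry><accession>P00002</accession><name>TWO</name></entry>
</uniprot>
XML

my $pattern = XML::LibXML::Pattern->new('//u:entry', { u => $ns });
my $reader  = XML::LibXML::Reader->new(string => $xml)
    or die "cannot create reader\n";

my @accessions;
while ($reader->nextPatternMatch($pattern) == 1) {
    # Skip the second report of the same element at its end tag.
    next unless $reader->nodeType == XML_READER_TYPE_ELEMENT;

    # Deep-copy just this entry into a small DOM fragment.
    my $entry = $reader->copyCurrentNode(1);

    # Ordinary XPath, with the same prefix registered for the namespace.
    my $xpc = XML::LibXML::XPathContext->new($entry);
    $xpc->registerNs(u => $ns);
    push @accessions, $xpc->findvalue('u:accession');
}

print join(', ', @accessions), "\n";
```

Each copied entry goes out of scope at the end of the loop body, so memory use should stay roughly proportional to the size of one entry.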
Re^3: XML::LibXML::Reader and XPATH and default namespace by dHarry (Abbot) on Feb 20, 2009 at 16:12 UTC
For big files DOM is out of the question, though there are always tricks of course. With SAX I've parsed big files with good performance (personally I favor the Apache implementations, Xalan and Xerces). Although I do use libxml2, I invariably use XML::Twig when I am in a Perl environment. I have parsed files over 1 GB with it. There is also XML::Twig::XPath, but I have never used it. You would have to check whether it solves your problem.

The document I mentioned is only one of a (big) series. They elaborate the best practices for XML Schema usage.
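For comparison, a rough sketch of the XML::Twig approach mentioned above. It assumes that XML::Twig, when no namespace mapping is configured, matches handlers against tag names as they are written in the file, so the unprefixed name entry should match despite the default namespace. The sample data and the accession child are illustrative.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample standing in for the large file.
my $xml = <<'XML';
<uniprot xmlns="http://uniprot.org/uniprot">
  <entry><accession>P00001</accession></entry>
  <entry><accession>P00002</accession></entry>
</uniprot>
XML

my @accessions;
my $twig = XML::Twig->new(
    twig_handlers => {
        # Called each time an <entry> element has been completely parsed.
        entry => sub {
            my ($t, $entry) = @_;
            push @accessions, $entry->first_child_text('accession');
            $t->purge;    # release what has been parsed so far
        },
    },
);

$twig->parse($xml);    # for a real file, parsefile($path) would be used
print join(', ', @accessions), "\n";
```

The purge call after each entry is what keeps memory use flat on large inputs.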