PerlMonks  

XML::LibXML & namespaces

by breezykatt (Acolyte)
on Oct 03, 2016 at 21:37 UTC

breezykatt has asked for the wisdom of the Perl Monks concerning the following question:

Hi - I'm using XML::LibXML to read and parse various XML files. I'm running into an issue where the files declare their namespaces in different places (either at the top of the file or within the nested tags). Additionally, the number in the namespace prefix (ns1, ns2, ...) changes from file to file for the same tags. So I wrote a script to work out the namespace prefix for each file and use it accordingly. The script only works when the namespaces are declared at the top of the file. When the namespaces are declared within the tags, I run into the following error:

XPath error : Undefined namespace prefix
error : xmlXPathCompiledEval: evaluation failed

The code is straightforward, but I must be missing something obvious. I tried spelling out the full path, changing "//" to "/". I tried other tags at different levels of the XML hierarchy, but the only time it works is when no namespace is used. Any ideas on how to get this working? Thanks.

# $rpt is just a namespace number retrieved from a sub and is correct for each file.
$nsdevices = "//ns" . $rpt . ":device";
my @devices = $doc->findnodes($nsdevices);
print "nsdevices:$nsdevices\n";

XML snippet for where it fails
<?xml version='1.0' encoding='utf-8'?>
<Notify xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://docs.oasis-open.org/wsn/b-2">
  <NotificationMessage>
    <Topic Dialect="http://docs.oasis-open.org/wsn/t-1/TopicExpression/Simple">TOPICNAME</Topic>
    <ProducerReference>
      <Address xmlns="http://www.w3.org/2005/08/addressing">address</Address>
      <Metadata xmlns="http://www.w3.org/2005/08/addressing">
        <ns2:MessageID xmlns:ns2="http://www.w3.org/2005/08/addressing">msgid</ns2:MessageID>
      </Metadata>
    </ProducerReference>
    <Message>
      <ns1:rpt xmlns:ns1="http://www.url.com/path/for/rpt">
        <ns1:reportObject>
          <ns1:device timestamp="2016-01-01T00:00:00.000-00:00">

Replies are listed 'Best First'.
Re: XML::LibXML & namespaces
by choroba (Cardinal) on Oct 03, 2016 at 23:10 UTC
    Use XML::LibXML::XPathContext to work with namespaces in XML::LibXML:
    #!/usr/bin/perl
    use warnings;
    use strict;
    use feature qw{ say };

    use XML::LibXML;

    my $dom = 'XML::LibXML'->load_xml(location => shift);
    my $xpc = 'XML::LibXML::XPathContext'->new($dom);

    my $rpt = 1;
    $xpc->registerNs("ns$rpt", 'http://www.url.com/path/for/rpt');

    my $nsdevices = "//ns$rpt:device";
    my @devices = $xpc->findnodes($nsdevices);
    say $_->getAttribute('timestamp') for @devices;

    Or, if you find it too verbose, try XML::XSH2:

    open file.xml ;
    register-namespace rpt http://www.url.com/path/for/rpt ;
    for //rpt:device echo @timestamp ;
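Regarding the changing prefix numbers mentioned in the question: XPath matches namespaces by URI, so the prefix registered with the XPathContext does not have to match the prefix used in any particular file. The following is a minimal sketch under that assumption. The two inline documents and the prefix `r` are made up for illustration; only the namespace URI comes from the snippet in the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Namespace URI taken from the snippet in the question.
my $RPT_URI = 'http://www.url.com/path/for/rpt';

# Two hypothetical documents binding the same URI to different prefixes.
my @docs = (
    qq{<root><ns1:rpt xmlns:ns1="$RPT_URI"><ns1:device timestamp="A"/></ns1:rpt></root>},
    qq{<root><ns7:rpt xmlns:ns7="$RPT_URI"><ns7:device timestamp="B"/></ns7:rpt></root>},
);

# Return the timestamp attribute of every device element in the document.
sub device_timestamps {
    my ($xml) = @_;
    my $dom = XML::LibXML->load_xml(string => $xml);
    my $xpc = XML::LibXML::XPathContext->new($dom);

    # 'r' is an arbitrary prefix chosen here; it is bound to the URI,
    # not to whatever prefix the document happens to use.
    $xpc->registerNs(r => $RPT_URI);

    return map { $_->getAttribute('timestamp') } $xpc->findnodes('//r:device');
}

print join(',', device_timestamps($_)), "\n" for @docs;
```

If this holds for the real files, the per-file prefix lookup may not be needed at all, as long as the namespace URI itself stays the same.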

    ($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,
      Thanks, it is a bit verbose, but your suggestion worked. I had everything in various loops, depending on the XML hierarchy, to bring back the values. In this case it brings back all the values at the same time, such as "value1value2value3value4" etc... for an element. Now I guess I have to figure out how to parse that or get them individually... fun :)
        In this case it brings back all the values at the same time, such as "value1value2value3value4" etc... for an element.

        I will admit that I am a beginner with respect to XML, but I think you took a leap of the imagination and you lost me.

        Here is the program:

        #!/usr/bin/perl
        # xml-libxml-ex1.pl  perl xml-libxml-ex1.pl file.xml  XML::LibXML & namespaces
        # From http://www.perlmonks.org/?node_id=1173200 Re: XML::LibXML & namespaces by choroba on Oct 03, 2016 at 19:10 EDT
        use warnings;
        use strict;
        use feature qw{ say };

        use XML::LibXML;

        my $dom = 'XML::LibXML'->load_xml(location => shift);
        my $xpc = 'XML::LibXML::XPathContext'->new($dom);

        my $rpt = 1;
        $xpc->registerNs("ns$rpt", 'http://www.url.com/path/for/rpt');

        my $nsdevices = "//ns$rpt:device";
        my @devices = $xpc->findnodes($nsdevices);
        say $_->getAttribute('timestamp') for @devices;

        # This prints:
        # 2016-01-01T00:00:00.000-00:00
        __END__

        Here is the input (which I had to modify in order to get the so-called "snippet" to work with E. Choroba's program):

        <?xml version='1.0' encoding='utf-8'?>
        <Notify xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://docs.oasis-open.org/wsn/b-2">
          <NotificationMessage>
            <Topic Dialect="http://docs.oasis-open.org/wsn/t-1/TopicExpression/Simple">TOPICNAME</Topic>
            <ProducerReference>
              <Address xmlns="http://www.w3.org/2005/08/addressing">address</Address>
              <Metadata xmlns="http://www.w3.org/2005/08/addressing">
                <ns2:MessageID xmlns:ns2="http://www.w3.org/2005/08/addressing">msgid</ns2:MessageID>
              </Metadata>
            </ProducerReference>
            <Message>
              <ns1:rpt xmlns:ns1="http://www.url.com/path/for/rpt">
                <ns1:reportObject>
                  <ns1:device timestamp="2016-01-01T00:00:00.000-00:00">
                  </ns1:device>
                </ns1:reportObject>
              </ns1:rpt>
            </Message>
          </NotificationMessage>
        </Notify>

        I named the input file file.xml

        Here is the output:

        2016-01-01T00:00:00.000-00:00

        Sorry, but I see one value. Perhaps I am not playing with a full deck, i.e., you have a different input file than I have, and you are asking a question regarding output that you have obtained from your input file, but I can only guess what your input file looks like.

        Anyhow, I was able to get E. Choroba's program to work on my machine which runs Strawberry Perl, and I have learned something from him, yet again.
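On the "value1value2value3value4" output: the real input file was never posted, so this is only a guess. One plausible cause is asking a parent element for its text, which yields the text of all its descendants joined together. A minimal sketch of reading each child on its own follows. The child elements `name` and `ip` and the prefix `r` are invented for illustration; only the namespace URI and the `device` element come from the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Namespace URI taken from the snippet in the question.
my $RPT_URI = 'http://www.url.com/path/for/rpt';

# Hypothetical input: the <name> and <ip> children are assumptions,
# since the original poster's full file was not shown.
my $xml = <<'XML';
<ns1:rpt xmlns:ns1="http://www.url.com/path/for/rpt">
  <ns1:device timestamp="2016-01-01T00:00:00.000-00:00">
    <ns1:name>value1</ns1:name>
    <ns1:ip>value2</ns1:ip>
  </ns1:device>
</ns1:rpt>
XML

# Return one [local name, text] pair per child element of each device.
sub device_fields {
    my ($xml) = @_;
    my $dom = XML::LibXML->load_xml(string => $xml);
    my $xpc = XML::LibXML::XPathContext->new($dom);
    $xpc->registerNs(r => $RPT_URI);

    my @fields;
    for my $device ($xpc->findnodes('//r:device')) {
        # Passing $device as the second argument makes the relative
        # expression './*' select only that device's child elements.
        for my $child ($xpc->findnodes('./*', $device)) {
            push @fields, [ $child->localname, $child->textContent ];
        }
    }
    return @fields;
}

print "$_->[0] = $_->[1]\n" for device_fields($xml);
```

Evaluating a relative XPath against each device node keeps the values separate, which may be closer to the per-level loops described earlier in the thread.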
