clapeters has asked for the wisdom of the Perl Monks concerning the following question:

I have a network that I monitor for performance. Each of the many nodes creates a performance monitoring XML file every 15 minutes. I would like to be able to extract specific data from the XML files, perform calculations on those bits of data, then create a gnuplot from that. I have what I need for the chart (gnuplot), but I cannot figure out how to get the data programmatically from the XML files.

You'll notice that the XML is in a format similar to this:

counter1 counter2 counter3 counter4 value1 value2 value3 value4

I am (obviously) new to this and was hoping for some pointers to get me going. I've read XML files with Perl in the past and have had a lot of luck with XML::Simple, but I can't get it working with my current files.

Specifically, I'd like to be able to extract Counter1/Value1 & Counter3/Value3, etc.

Thanks in advance!

Sample XML below. Let's say I'm looking for <mt>pmLicUlPrbCapDistr</mt> which can be found in <r>90083,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0</r> below.

    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="MeasDataCollection.xsl"?>
    <!DOCTYPE mdc SYSTEM "MeasDataCollection.dtd">
    <mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">
      <mfh>
        <ffv>32.401 V6.2</ffv>
        <sn>SubNetwork=ONRM_ROOT_MO_R,SubNetwork=SAMPLE,MeContext=LTE0001_SITE</sn>
        <st></st>
        <vn></vn>
        <cbt>20140401204500Z</cbt>
      </mfh>
      <md>
        <neid>
          <neun>LTE0001_SITE</neun>
          <nedn>SubNetwork=ONRM_ROOT_MO_R,SubNetwork=SAMPLE,MeContext=LTE0001_SITE</nedn>
          <nesw>CXP102051/18_R21BN</nesw>
        </neid>
        <mi>
          <mts>20140401210000Z</mts>
          <gp>900</gp>
          <mt>pmPdcpPktDiscDlEth</mt>
          <mv>
            <moid>ManagedElement=1,ENodeBFunction=1</moid>
            <r>0</r>
          </mv>
        </mi>
      </md>
      <md>
        <neid>
          <neun>LTE0001_SITE</neun>
          <nedn>SubNetwork=ONRM_ROOT_MO_R,SubNetwork=SAMPLE,MeContext=LTE0001_SITE</nedn>
          <nesw>CXP102051/18_R21BN</nesw>
        </neid>
        <mi>
          <mts>20140401210000Z</mts>
          <gp>900</gp>
          <mt>pmLicDlCapActual</mt>
          <mt>pmLicUlCapActual</mt>
          <mt>pmLicDlPrbCapActual</mt>
          <mt>pmLicUlPrbCapActual</mt>
          <mt>pmPdcpPktDiscDlEth</mt>
          <mt>pmPdcpPktDiscUlEthPacing</mt>
          <mt>pmLicDlCapDistr</mt>
          <mt>pmLicUlCapDistr</mt>
          <mt>pmLicDlPrbCapDistr</mt>
          <mt>pmLicUlPrbCapDistr</mt>
          <mv>
            <moid>ManagedElement=1,Equipment=1,Subrack=1,Slot=1,PlugInUnit=1,DeviceGroup=dul,BbProcessingResource=1</moid>
            <r>0</r>
            <r>0</r>
            <r>0</r>
            <r>0</r>
            <r>0</r>
            <r>0</r>
            <r>81893,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0</r>
            <r>90083,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0</r>
            <r>81893,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0</r>
            <r>90083,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0</r>
          </mv>
        </mi>
      </md>
      <mff><ts>20140401210000Z</ts></mff>
    </mdc>

Replies are listed 'Best First'.
Re: Extract Data from XML (unique xml format?)
by tangent (Parson) on Jun 16, 2015 at 20:49 UTC
    I'm not sure if I understand your requirement exactly, but here is a way to extract the values using XML::LibXML and XPath:

        use XML::LibXML;

        my $doc = XML::LibXML->load_xml(string => $xml);
        my @mds = $doc->findnodes('//md');
        for my $md ( @mds ) {
            my @mts = $md->findnodes('mi/mt');
            next unless @mts;
            my @rs = $md->findnodes('mi/mv/r');
            for my $i ( 0 .. $#mts ) {
                my ( $mt, $r ) = ( $mts[$i], $rs[$i] );
                print "mt: ", $mt->textContent, "\n";
                print "r: ", $r->textContent, "\n\n";
            }
        }

    Output using your example XML:

        mt: pmPdcpPktDiscDlEth
        r: 0

        mt: pmLicDlCapActual
        r: 0

        mt: pmLicUlCapActual
        r: 0

        mt: pmLicDlPrbCapActual
        r: 0

        mt: pmLicUlPrbCapActual
        r: 0

        mt: pmPdcpPktDiscDlEth
        r: 0

        mt: pmPdcpPktDiscUlEthPacing
        r: 0

        mt: pmLicDlCapDistr
        r: 81893,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0

        mt: pmLicUlCapDistr
        r: 90083,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0

        mt: pmLicDlPrbCapDistr
        r: 81893,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0

        mt: pmLicUlPrbCapDistr
        r: 90083,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0

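Once each counter is paired with its value, the calculation step mentioned in the question might look roughly like the sketch below. It uses only core Perl. The `%value_for` hash and the `bins` helper are illustrative names, not part of any module, and the sum is just a placeholder for whatever calculation is actually needed.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Two counter/value pairs copied from the sample data (assumed to have been
# collected into a hash instead of printed).
my %value_for = (
    pmLicDlPrbCapDistr => '81893,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0',
    pmLicUlPrbCapDistr => '90083,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0',
);

# Hypothetical helper: split one comma-separated distribution into its bins.
sub bins { return split /,/, $_[0] }

my @ul    = bins( $value_for{pmLicUlPrbCapDistr} );
my $total = sum(@ul);    # placeholder calculation

# One whitespace-separated line per sample is a layout gnuplot can read;
# the timestamp is the <mts> value from the sample file.
my $line = join ' ', '20140401210000Z', $total;
print "$line\n";
```

Appending one such line per 15-minute file to a data file would give gnuplot a simple time series to plot.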
          use XML::LibXML;

          my $string = q|
          <Root>
            <Parent>
              <child1>Child 1</child1>
              <child2>Child 2</child2>
              <child3>Child 3</child3>
            </Parent>
          </Root>|;

          my $doc = XML::LibXML->load_xml(string => $string);
          my @nodes = $doc->findnodes('//Parent');
          for my $node (@nodes) {
              my @childnodes = $node->childNodes or next;
              for my $cnode (@childnodes) {
                  if ($cnode->nodeName =~ m/^child/) {
                      print 'name: '. $cnode->nodeName . ', ';
                      print 'content: '. $cnode->textContent ."\n";
                  }
              }
          }

          # Output:
          # name: child1, content: Child 1
          # name: child2, content: Child 2
          # name: child3, content: Child 3

        This works as well AjitKhodke. Again, I so appreciate your help and responses from all.

      Thanks so much tangent. This appears to be exactly what I'm looking for.

      I apologize for not being more clear. It made sense in my head, but not sure that's the best indicator of clarity.

      ++, and, without question, this is how I would recommend doing it.

      The overwhelming advantage of XPath is that it lets you describe what you want in nothing more than a simple text string. It's up to the XPath engine to figure out how to find it. The example shown performs many searches, and there's not a drop of "how" code in it. You want an approach like this, one that lets you change what you search for, and subsequently plot, by changing the expression rather than rewriting code.
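As a minimal sketch of that idea, the snippet below picks out a single named counter using XPath expressions only. It assumes, as the output earlier in the thread suggests, that the n-th <r> inside an <mv> belongs to the n-th <mt> of the same <mi>. The XML is a trimmed-down stand-in for the sample file, and `counter_values` is a hypothetical helper name.

```perl
use strict;
use warnings;
use XML::LibXML;

# Trimmed-down stand-in for the sample document (shortened moid and values).
my $xml = <<'XML';
<mdc>
  <md>
    <mi>
      <mt>pmLicDlPrbCapDistr</mt>
      <mt>pmLicUlPrbCapDistr</mt>
      <mv>
        <moid>ManagedElement=1,BbProcessingResource=1</moid>
        <r>81893,0,0</r>
        <r>90083,0,0</r>
      </mv>
    </mi>
  </md>
</mdc>
XML

# Hypothetical helper: return the <r> text for one counter name, per <mv>.
sub counter_values {
    my ( $doc, $name ) = @_;
    my @found;
    # Select only the <mt> elements whose text equals the counter name.
    for my $mt ( $doc->findnodes(qq{//mi/mt[. = "$name"]}) ) {
        # XPath positions are 1-based: count the <mt> siblings before this one.
        my $pos = $mt->findvalue('count(preceding-sibling::mt)') + 1;
        # The matching value is the <r> at the same position in each <mv>.
        for my $mv ( $mt->parentNode->findnodes('mv') ) {
            push @found, $mv->findvalue("r[$pos]");
        }
    }
    return @found;
}

my $doc    = XML::LibXML->load_xml( string => $xml );
my @values = counter_values( $doc, 'pmLicUlPrbCapDistr' );
print "$_\n" for @values;
```

Asking for a different counter then only means passing a different name; the lookup code stays the same.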

        simple text string? funny

Re: Extract Data from XML (unique xml format?)
by GotToBTru (Prior) on Jun 16, 2015 at 20:11 UTC

    As the docs for XML::Simple indicate, you probably want to use one of the more recent modules. It suggests XML::LibXML. XML::Twig is popular here on PM. My own very limited experience with XML suggests that you first need to be good with xpaths.
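For comparison, here is a rough XML::Twig sketch of the same extraction. It relies on the same assumption that the n-th <r> in an <mv> pairs with the n-th <mt> of its <mi>. The trimmed XML and the `%value_for` hash are illustrative, not taken from the original files.

```perl
use strict;
use warnings;
use XML::Twig;

# Trimmed-down stand-in for the sample document (shortened moid and values).
my $xml = <<'XML';
<mdc>
  <md>
    <mi>
      <mt>pmLicDlPrbCapDistr</mt>
      <mt>pmLicUlPrbCapDistr</mt>
      <mv>
        <moid>ManagedElement=1,BbProcessingResource=1</moid>
        <r>81893,0,0</r>
        <r>90083,0,0</r>
      </mv>
    </mi>
  </md>
</mdc>
XML

my %value_for;    # counter name => list of <r> strings, one per <mv>

my $twig = XML::Twig->new(
    twig_handlers => {
        # Called once for every completely parsed <mi> element.
        mi => sub {
            my ( $t, $mi ) = @_;
            my @names = map { $_->text } $mi->children('mt');
            for my $mv ( $mi->children('mv') ) {
                my @vals = map { $_->text } $mv->children('r');
                for my $i ( 0 .. $#names ) {
                    push @{ $value_for{ $names[$i] } }, $vals[$i];
                }
            }
            $t->purge;    # release the part of the tree already handled
        },
    },
);
$twig->parse($xml);

print "$_\n" for @{ $value_for{pmLicUlPrbCapDistr} };
```

Because the handler runs per <mi> and purges as it goes, this style may suit large files better than loading the whole tree at once.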

    Dum Spiro Spero
Re: Extract Data from XML (unique xml format?)
by Discipulus (Canon) on Jun 16, 2015 at 20:31 UTC
    Let's say I'm looking for <mt>pmLicUlPrbCapDistr</mt> which can be found in <r>90083,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0</r> below.

    I cannot understand what you are trying to extract.
    Anyway, give XML::Twig a try, as suggested.

    L*
    There are no rules, there are no thumbs..
    Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.