Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Task: Parse the XML returned from a webserver.
Input: Post a request to the webserver.
Output: The webserver returns XML.
Problem: Processing the XML when it has more than one child element.

Below is the XML file:

<root>
  <report id="0" ip="1.1.1.1">
    <node mid="machine1" name="Winxp">
      <values>1, 2, 3, 4, 5, 6</values>
      <time>110, 120, 130, 140, 150, 160</time>
    </node>
</root>

Below is the Perl program:

use XML::Simple;
my $simple = XML::Simple->new();
my $tree = $simple->XMLin($xmlfile);
@values = split ',', $tree->{report}{node}{values};
@time = split ',', $tree->{report}{node}{time};

Using the above program, I was able to get the values and timestamps into the arrays @values and @time. (Note that there is only one <node> element.)

But now I need to parse XML with two or more <node> child elements:

<root>
  <report id="0" ip="1.1.1.1">
    <node mid="machine1" name="Winxp">
      <values>1, 2, 3, 4, 5, 6</values>
      <time>110, 120, 130, 140, 150, 160</time>
    </node>
    <node mid="machine2" name="Win2003">
      <values>1, 2, 3, 4, 5, 6</values>
      <time>110, 120, 130, 140, 150, 160</time>
    </node>
</root>

Note: I know the node's mid and name attributes. (I mean, I know I am requesting two machines, and the machine ID and name are attributes of the node element.)

Now how do I query a particular node and then extract the contents of its values and time elements?

Code tags and other markup added by GrandFather

Re: Parsing XML file for more than 1 child element with attributes
by Cody Pendant (Prior) on Feb 06, 2008 at 11:21 UTC
    You could do it this way (XML::Simple creates a hash of hashes because the nodes have an attribute called "name", I guess):
    foreach $node ( keys( %{ $tree->{report}->{node} } ) ) {
        print "$node:values: $tree->{report}->{node}->{$node}->{values}\n";
    }
    But you'd be better off adding KeyAttr to your new() call:
    my $simple = XML::Simple -> new (KeyAttr => 'node');
    Because that way you will get an array of nodes:
    $tree = {
        'report' => {
            'ip'   => '1.1.1.1',
            'id'   => '0',
            'node' => [
                {
                    'mid'    => 'machine1',
                    'time'   => '110, 120, 130, 140, 150, 160',
                    'name'   => 'Winxp',
                    'values' => '1, 2, 3, 4, 5, 6'
                },
                {
                    'mid'    => 'machine2',
                    'time'   => '110, 120, 130, 140, 150, 160',
                    'name'   => 'Win2003',
                    'values' => '1, 2, 3, 4, 5, 6'
                }
            ]
        }
    };
    Hint -- use Data::Dumper to look at the tree.
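
With the array layout shown above, picking out one machine comes down to scanning the list for a matching mid and name. The following is only a sketch: it uses a literal copy of the dumped structure rather than calling XMLin, so it runs without XML::Simple installed, and find_node is a hypothetical helper name, not part of any module.

```perl
use strict;
use warnings;

# Literal copy of the structure dumped above (an array of node hashes).
my $tree = {
    report => {
        id   => '0',
        ip   => '1.1.1.1',
        node => [
            {
                mid    => 'machine1',
                name   => 'Winxp',
                values => '1, 2, 3, 4, 5, 6',
                time   => '110, 120, 130, 140, 150, 160',
            },
            {
                mid    => 'machine2',
                name   => 'Win2003',
                values => '1, 2, 3, 4, 5, 6',
                time   => '110, 120, 130, 140, 150, 160',
            },
        ],
    },
};

# Hypothetical helper: return the first node whose mid and name both match,
# or undef if there is no such node.
sub find_node {
    my ($tree, $mid, $name) = @_;
    my ($match) = grep { $_->{mid} eq $mid && $_->{name} eq $name }
                  @{ $tree->{report}{node} };
    return $match;
}

my $node = find_node($tree, 'machine2', 'Win2003');
if ($node) {
    # Split on the comma plus surrounding whitespace so no element keeps a leading space.
    my @values = split /\s*,\s*/, $node->{values};
    my @time   = split /\s*,\s*/, $node->{time};
    print "values: @values\n";
    print "time: @time\n";
}
```

As far as I recall from the XML::Simple documentation, an empty list (KeyAttr => []) is the usual way to switch folding off, and ForceArray => ['node'] keeps node an array even when the response contains a single <node>. Treat both as options to check against the docs rather than as tested advice.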


    Nobody says perl looks like line-noise any more
    kids today don't know what line-noise IS ...
      Thanks. Using Data::Dumper I was able to see the tree structure, and this is the structure that I was looking for. I tried the for loop that you provided:

      foreach $node ( keys( %{ $tree->{report}->{node} } ) ) {
          print "$node:values: $tree->{report}->{node}->{$node}->{values}\n";
      }

      But what I observe is that the keys obtained are mid, time, name and values. What I am trying to accomplish is something like: "if the mid is 'machine1' and the name is 'Winxp', then get the 'time' and 'values'." Similarly, I will loop across many mids and names. Since I already know the mid and name, that part is easy. The problem is how to construct the code for this task.
        Well, I think the reason nobody's helping you is, you haven't shown any of your own code yet.

        If you understand how

        foreach $node ( keys( %{ $tree->{report}->{node} } ) ) {
            print "$node:VALUES: $tree->{report}->{node}->{$node}->{values}\n";
        }
        works, then you should be able to just add an "if" to my code to get what you want. If you don't, or if you don't even know how to do "if", in Perl, you'd better say so.
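
For reference, here is one way the "if" could look. This is a sketch under an assumption: it takes the default XML::Simple layout for two or more <node> elements, where the nodes are folded into a hash keyed by the name attribute (so name is the key and no longer appears inside each node). The literal $tree stands in for the XMLin result, and $want_mid and $want_name are illustrative variable names.

```perl
use strict;
use warnings;

# Assumed shape of the default (folded) XMLin result for the two-node document.
my $tree = {
    report => {
        id   => '0',
        ip   => '1.1.1.1',
        node => {
            Winxp => {
                mid    => 'machine1',
                values => '1, 2, 3, 4, 5, 6',
                time   => '110, 120, 130, 140, 150, 160',
            },
            Win2003 => {
                mid    => 'machine2',
                values => '1, 2, 3, 4, 5, 6',
                time   => '110, 120, 130, 140, 150, 160',
            },
        },
    },
};

# The machine we already know we asked for.
my ($want_mid, $want_name) = ('machine1', 'Winxp');

my (@values, @time);
foreach my $name ( sort keys %{ $tree->{report}{node} } ) {
    my $node = $tree->{report}{node}{$name};

    # The hash key is the name attribute; mid is still stored inside the node.
    if ( $name eq $want_name && $node->{mid} eq $want_mid ) {
        @values = split /\s*,\s*/, $node->{values};
        @time   = split /\s*,\s*/, $node->{time};
        last;
    }
}

print "values: @values\n";
print "time: @time\n";
```

If the keys that come back are mid, time, name and values instead of machine names, the document probably held a single <node>, in which case no folding happens and the loop is walking the node itself.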


Re: Parsing XML file for more than 1 child element with attributes (XML::Twig)
by Tanktalus (Canon) on Feb 06, 2008 at 15:39 UTC

    XML::Simple may work great for simple stuff, and though your new requirement may still be simple, it seems like it won't take long for you to surpass that. Personally, I find XML::Twig simple enough for the simple stuff, and no more complex for the complex stuff... so I'd recommend going that way sooner rather than later.

    That said, this doesn't appear to be valid XML: there is no close tag for 'report'. So it seems unfortunate that XML::Simple doesn't complain (it's not really XML anymore). Anyway, if I just close that tag immediately, I get:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use XML::Twig;

    my $twig = XML::Twig->new();
    $twig->parse(\*DATA);

    for my $node ($twig->get_xpath('/root/node')) {
        print "Machine: ", $node->att('mid'), "\n";
        my ($values) = $node->get_xpath('values');
        print "\tvalues: ", $values->text(), "\n";
        my ($time) = $node->get_xpath('time');
        print "\ttime: ", $time->text(), "\n";
    }
    __END__
    <root>
      <report id="0" ip="1.1.1.1" />
      <node mid="machine1" name="Winxp">
        <values>1, 2, 3, 4, 5, 6</values>
        <time>110, 120, 130, 140, 150, 160</time>
      </node>
      <node mid="machine2" name="Win2003">
        <values>1, 2, 3, 4, 5, 6</values>
        <time>110, 120, 130, 140, 150, 160</time>
      </node>
    </root>
    And that gives:
    Machine: machine1
            values: 1, 2, 3, 4, 5, 6
            time: 110, 120, 130, 140, 150, 160
    Machine: machine2
            values: 1, 2, 3, 4, 5, 6
            time: 110, 120, 130, 140, 150, 160
    If you close the report after the last node, the $twig->get_xpath will have to change to /root/report/node, or you can just use //node which will work either way.

      Just for the record, XML::Simple does choke on the non-well-formed XML. It dies with a very useful message.

      I fixed the bad XML but didn't mention it for my example.



        Thanks, Cody Pendant and Tanktalus.

        my $simple = XML::Simple->new(KeyAttr => 'node');

        I thought the above was simple, as the XML tree is returned as a hash of hashes and so on. But the structure entries I saw made me think again. Since I want a subroutine that is easy for me to use, I plan to use XML::Twig. I can pass the XML as a string, along with the mid and name, to the subroutine. The subroutine can then return the values and timestamps.
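
A subroutine along those lines might look like the sketch below. It assumes XML::Twig is installed and that the XML is well-formed (the sample here closes <report> after the last node). The name node_readings and its return convention (two array references, or an empty list when nothing matches) are illustrative choices, not anything prescribed in this thread.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical helper: parse $xml and return (\@values, \@time) for the <node>
# whose mid and name attributes both match, or an empty list if none does.
sub node_readings {
    my ($xml, $mid, $name) = @_;

    my $twig = XML::Twig->new();
    $twig->parse($xml);    # parse() accepts a string as well as a filehandle

    # //node finds <node> elements whether or not they sit inside <report>.
    for my $node ( $twig->get_xpath('//node') ) {
        next unless defined $node->att('mid')  && $node->att('mid')  eq $mid;
        next unless defined $node->att('name') && $node->att('name') eq $name;

        # Split on the comma plus surrounding whitespace to get clean elements.
        my @values = split /\s*,\s*/, $node->first_child_text('values');
        my @time   = split /\s*,\s*/, $node->first_child_text('time');
        return ( \@values, \@time );
    }
    return;
}

# Sample document based on the one in the question, with <report> closed.
my $xml = <<'XML';
<root>
  <report id="0" ip="1.1.1.1">
    <node mid="machine1" name="Winxp">
      <values>1, 2, 3, 4, 5, 6</values>
      <time>110, 120, 130, 140, 150, 160</time>
    </node>
    <node mid="machine2" name="Win2003">
      <values>1, 2, 3, 4, 5, 6</values>
      <time>110, 120, 130, 140, 150, 160</time>
    </node>
  </report>
</root>
XML

my ($values, $time) = node_readings($xml, 'machine2', 'Win2003');
if ($values) {
    print "values: @$values\n";
    print "time: @$time\n";
}
```

Returning array references keeps the two lists separate; a caller who prefers a single structure could return a hash reference with values and time keys instead.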