Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

sample of xml to parse:
<host name=jimmy> <function>web</function> <function>dns</function> <location>miami</function> </host>
My code to parse:
#!/usr/bin/perl

# use module
use XML::Simple;
use Data::Dumper;

# create object
$xml = new XML::Simple (KeyAttr=>[]);

# read XML file
$data = $xml->XMLin("hosts.xml");

foreach $e (@{$data->{host}}) {
    print $e->{name}, "\n";
    print $e->{function}, "\n";
}
My problem is this: when machines have multiple duties, they do not get printed when I attempt to parse the xml file. I am sure there is a simple solution. Thanks for your help.

Updated Steve_p - added code tags

Re: XML parsing
by crashtest (Curate) on May 09, 2005 at 22:41 UTC
    I worked through your XML example (which was malformed... I hope you made mistakes while re-typing, or else you really need to fix the source XML file). Your problem seems to be that XML::Simple returns different types of data depending on what's in the XML file. If it finds only one element, it puts a scalar in the returned hash. If it finds more than one, it returns a reference to an array of data.
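    A minimal sketch of that shape change, assuming XML::Simple is installed. The two inline XML strings are made up for illustration and are not the original hosts.xml:

```perl
use strict;
use warnings;
use XML::Simple;

# Hypothetical inline documents: same element names, different counts.
my $one = XMLin('<host name="a"><function>web</function></host>',
                KeyAttr => []);
my $two = XMLin('<host name="b"><function>web</function><function>dns</function></host>',
                KeyAttr => []);

# ref() returns an empty string for a plain scalar
# and 'ARRAY' for an array reference.
print "one function:  ", (ref($one->{function}) || 'plain scalar'), "\n";
print "two functions: ", (ref($two->{function}) || 'plain scalar'), "\n";
```

    The first line should report a plain scalar and the second an ARRAY, which is why code written against one shape breaks on the other.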

    Not having used this module myself, I thought this was difficult behavior, but after RTFM, it turns out there is an option (clearly marked "important") to force certain elements to always be represented as arrays, even if there is only one. I turned that on and then got the code to run on a modified version of your XML block. I tried to mark the important parts in the code below:
    #!/usr/bin/perl

    # use module
    use strict;
    use warnings;
    use XML::Simple;
    use Data::Dumper;

    # create object
    my $xml = new XML::Simple (KeyAttr=>[], ForceArray => ['function']);
    #                             IMPORTANT ^^^^^^^^^^

    # read XML file
    my $data = $xml->XMLin("hosts.xml");

    my $e = $data->{'host'};
    print "Name: ", $e->{name}, "\n";

    # IMPORTANT: Use a loop to process all the possible
    # functions, instead of just one.
    foreach my $fct ( @{$e->{'function'}} ){
        print "Function: ", $fct, "\n";
    }
    <allHosts>
      <host name="jimmy">
        <function>web</function>
        <function>dns</function>
        <location>miami</location>
      </host>
    </allHosts>
    Please use strict and warnings in the future, as they would have alerted you to the problems in your code instead of silently giving you no output.
      Using your code, I get the following error

      Pseudo-hashes are deprecated at ./test.pl line 19.
        Please don't just "use my code". Your XML file is different from the block I used, in that I only have one host and you have several.

        The key point (I'm sorry if it wasn't clear) I was making was that XML::Simple returns a reference to an array when there is more than one element with the same name. You need to either differentiate between the "singular" and the "plural" using ref, or force the element to be an array reference all the time using ForceArray => 1 in your options.

        The structure and form of the data XML::Simple returns depends on the XML it parses. I think that's the crux of your problem.
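        A rough sketch of the ref approach for a file with several hosts. The inline XML and the as_list helper are invented for this example (as_list is not part of XML::Simple), so adapt the names to your real file:

```perl
use strict;
use warnings;
use XML::Simple;

# Hypothetical data: one host with two functions, one with a single function.
my $xml_text = q{<allHosts>
  <host name="jimmy"><function>web</function><function>dns</function></host>
  <host name="haffa"><function>nfs</function></host>
</allHosts>};

# ForceArray on 'host' keeps the outer loop valid even for a one-host file;
# 'function' is deliberately left alone so its shape still varies.
my $data = XMLin($xml_text, KeyAttr => [], ForceArray => ['host']);

# Helper invented for this sketch: returns a flat list whether it is
# given a plain scalar, an array reference, or undef.
sub as_list {
    my ($value) = @_;
    return () unless defined $value;
    return ref($value) eq 'ARRAY' ? @$value : ($value);
}

foreach my $host ( @{ $data->{host} } ) {
    print "Name: $host->{name}\n";
    print "Function: $_\n" foreach as_list( $host->{function} );
}
```

        If you would rather not test the shape at every access, putting 'function' into the ForceArray list as well makes the helper unnecessary.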

Re: XML parsing
by jhourcle (Prior) on May 09, 2005 at 22:04 UTC

    I'm surprised you get as far as you do.

    Your XML isn't well-formed: the value of the 'name' attribute isn't quoted.

    I'd suggest using Data::Dumper to print the structure that XMLin returned. (you've already loaded Data::Dumper, you might as well make use of it).

      Sorry, it was mistyped in this form (I should've copied and pasted...). The name attribute is quoted in my XML file. The dumped data is not that simple for me to read, though. How do I make sense of the following:
      $VAR1 = {
      'host' => {
      'jimmy.foo.com' => {
      'hardware' => 'eht0', 'eht1' ,
      'duty' => 'web'
      },
      The list goes on, but this is the detailed info for one host. How do I make this useful? Thanks for being patient with me.

        When you're posting code on here, you really should wrap your copy/pasted items with <code> ... </code>. I'm guessing you probably saw:

        $VAR1 = {
          'host' => {
            'jimmy.foo.com' => {
              'hardware' => ['eht0', 'eht1'],
              'duty' => 'web'
            },
          },
        };

        So basically, you have a hash of hashes of hashes of something that might be an array. What you'll want to do is force them all to be an array (pass in ForceArray => [ qw(duty) ] to XMLin, or when you create a new XML::Simple object), or you'll have to check for each one:

        my $hosts = $data->{'host'};
        foreach my $host ( keys %$hosts ) {
            my $duties = $hosts->{$host}->{'duty'};
            print "$host : $_\n"
                foreach ( UNIVERSAL::isa( $duties, 'ARRAY' ) ? @$duties : $duties );
        }
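        For comparison, here is a sketch of the ForceArray route, so no per-element check is needed. The inline XML is a guess at data shaped like the dump above, and the host names are placeholders:

```perl
use strict;
use warnings;
use XML::Simple;

# Hypothetical data shaped like the dump above.
my $xml_text = q{<hosts>
  <host name="jimmy.foo.com"><duty>web</duty></host>
  <host name="haffa.foo.com"><duty>nfs</duty><duty>quake</duty></host>
</hosts>};

my $data = XMLin(
    $xml_text,
    KeyAttr    => { host => 'name' },   # fold hosts into a hash keyed by name
    ForceArray => [ qw(host duty) ],    # always array refs, even for one element
);

my $hosts = $data->{'host'};
foreach my $host ( sort keys %$hosts ) {
    # "|| []" guards against a host that has no <duty> element at all.
    print "$host : $_\n" foreach @{ $hosts->{$host}->{'duty'} || [] };
}
```

        Listing 'host' in ForceArray matters too: with only one host in the file, XML::Simple would otherwise return a single hash rather than an array, and the fold by name would not happen.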
Re: XML parsing
by devnul (Monk) on May 09, 2005 at 22:54 UTC
    I'm having a hard time reading your example (please use code tags), and your description of the result is hard to follow too, so I'll guess at what you were trying to do and hope it helps. Here is my version of the XML document:
    <hosts>
      <host name="jimmy">
        <function>web</function>
        <function>dns</function>
        <location>miami</location>
      </host>
      <host name="haffa">
        <function>nfs</function>
        <function>quake</function>
        <location>hell</location>
      </host>
    </hosts>

    Now here's the code to parse it: (I did not know what you meant to do with the hash you passed into XMLin, so I removed it)

    use XML::Simple;
    use Data::Dumper;

    # create object
    my $xml = new XML::Simple ();

    # read XML file
    my $data = $xml->XMLin("/tmp/hosts.xml");

    print Dumper($data);


    And here is the output:
    $VAR1 = {
      'host' => {
        'haffa' => {
          'function' => [
            'nfs',
            'quake'
          ],
          'location' => 'hell'
        },
        'jimmy' => {
          'function' => [
            'web',
            'dns'
          ],
          'location' => 'miami'
        }
      }
    };

    I think the results are fairly self-explanatory?

    - dEvNuL