FlanOSU has asked for the wisdom of the Perl Monks concerning the following question:

I have successfully run the basic XML parsing examples found around the internet, but when I try to apply them to my actual XML files, I am stumped.

I could use some assistance in learning how to access the lower levels of this XML tree. Here is some of the output from Dumper (hopefully this is legible):

$VAR1 = {
  'xmlns:xsi' => 'http://www.w3.org/2001/XMLSchema-instance',
  'ICD_Name' => 'MFU_icd',
  'ICD_Description' => {
    'MsgData' => {
      'MsgDataRegion' => {
        'Msg_Data' => {
          'Array' => {
            'ArrayName' => 'MFU_ID',
            'NumOfArrayRepetitions' => '1',
            'StructureType' => [
              {
                'DataLine' => {

I am trying to read down to the DataLine level, but I can't even get it to output the top level. Here is my code:

#!/usr/bin/perl

# use module
use strict;
use XML::Simple;
use Data::Dumper;

# read XML file
my $xmlfile = "./GA_MFU_Get_Status.xml";
my $ref = eval { XMLin($xmlfile) };

# print entire output
print Dumper($ref);

foreach my $item (@{$ref->{ICD_Description}}) {
    print $item->{MsgData}, "\n";
    print ": ", $item->{MsgData}->{MsgDataRegion}->{Msg_Data}->{Array}, "\n";
    print ": ", $item->{MsgData}->{MsgDataRegion}->{Msg_Data}->{Array}->{StructureType}->{DataLine}, "\n";
    print ": ", $item->{MsgData}->{MsgDataRegion}->{Msg_Data}->{Array}->{StructureType}->{DataLine}->{DataField}, "\n";
    print "\n";
}

I get the message "Not an ARRAY reference at C:\GA_MFU_ICD_XML\XML\XMLparser.pl line 18." Anything to get me on the right track would be much appreciated.

Replies are listed 'Best First'.
Re: Parsing deep XML
by lostjimmy (Chaplain) on Feb 24, 2010 at 22:39 UTC
    The error message describes the problem exactly: you are trying to use the reference in $ref->{ICD_Description} as an array, when it's actually a hash. Your foreach loop can't traverse the data that way.

    In other words, $ref is a hashref, and $ref->{ICD_Description} is also a hashref, but you're trying to use it like it's an array ref. The output from Dumper is the key to seeing what kind of data are in the variable. If a section starts with a {, it is a hashref; if it starts with a [ it is an array ref. It also looks like StructureType points to an array ref, which then has a list of hashrefs.
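
    If the Dumper output is hard to read, Perl's built-in ref() reports the same information for one level at a time. The following is a minimal sketch, not the original poster's data: the $ref structure is a hand-built stand-in modelled on the dump above, and the DataField value is invented because the posted dump is cut off at DataLine.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the structure XMLin() returned. The DataField
# value is invented; the posted dump is truncated at the DataLine level.
my $ref = {
    ICD_Name        => 'MFU_icd',
    ICD_Description => {
        MsgData => {
            MsgDataRegion => {
                Msg_Data => {
                    Array => {
                        ArrayName     => 'MFU_ID',
                        StructureType => [
                            { DataLine => { DataField => 'example' } },
                        ],
                    },
                },
            },
        },
    },
};

my $array = $ref->{ICD_Description}{MsgData}{MsgDataRegion}{Msg_Data}{Array};

# ref() returns 'HASH' or 'ARRAY' for a reference of that kind and an
# empty string for a plain scalar.
my $description_type = ref($ref->{ICD_Description});
my $structure_type   = ref($array->{StructureType});
my $name_type        = ref($ref->{ICD_Name});

print "ICD_Description: $description_type\n";
print "StructureType:   $structure_type\n";
print "ICD_Name:        ", ($name_type || 'plain scalar'), "\n";
```

    Anything that reports HASH is accessed with ->{key}, and anything that reports ARRAY is accessed with ->[index] or expanded with @{ ... }.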

    You should be able to access the array of StructureTypes with something similar to the following (untested):

    foreach my $structure_type (@{ $ref->{ICD_Description}{MsgData}{MsgDataRegion}{Msg_Data}{Array}{StructureType} }) {
        # $ref and each $structure_type are hashrefs, so the first lookup needs ->
        print "$structure_type->{DataLine}{DataField}\n";
    }
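
    One caveat with the loop above: with XML::Simple's default options, an element that appears only once is usually returned as a single hashref rather than a one-element array ref, so the same code can fail on a file that contains just one StructureType. The sketch below shows one way to cope with both shapes. The names structure_types and wrap are hypothetical helpers, and the sample data is invented for illustration; it is not taken from the original file.

```perl
use strict;
use warnings;

# Hypothetical helper: return the StructureType entries as a list, whether
# the parser produced a single hashref or an arrayref of hashrefs.
sub structure_types {
    my ($ref) = @_;
    my $node = $ref->{ICD_Description}{MsgData}{MsgDataRegion}{Msg_Data}{Array}{StructureType};
    return ()     unless defined $node;
    return @$node if ref($node) eq 'ARRAY';
    return ($node);
}

# Hypothetical helper that builds the surrounding hashes for the demo data.
sub wrap {
    my ($structure_type) = @_;
    return {
        ICD_Description => {
            MsgData => {
                MsgDataRegion => {
                    Msg_Data => { Array => { StructureType => $structure_type } },
                },
            },
        },
    };
}

# Invented sample data: one document with two entries, one with a single entry.
my $many   = wrap([ { DataLine => { DataField => 'first' } },
                    { DataLine => { DataField => 'second' } } ]);
my $single = wrap({ DataLine => { DataField => 'only' } });

my @fields_many   = map { $_->{DataLine}{DataField} } structure_types($many);
my @fields_single = map { $_->{DataLine}{DataField} } structure_types($single);

print "many:   @fields_many\n";
print "single: @fields_single\n";
```

    XML::Simple also has a ForceArray option that makes the named elements always come back as array refs, which removes the need for this kind of check at the cost of changing the shape of the parsed data.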
Re: Parsing deep XML
by grantm (Parson) on Feb 25, 2010 at 03:22 UTC

    Before you get too deep into using XML::Simple, take a look at this article which demonstrates how XML::LibXML can actually be simpler.