zeni has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I am trying to extract the contents of an XML file using an XML parser.

There is a possibility that multiple entries exist under the same tag name.

Ex:

< GEN_FSM >
  < State >
    < Name > POWERED < /Name >
    < Comment > USB POWERED State < /Comment >
  < /State >
  < State >
    < Name > DEFAULT < /Name >
    < Comment > USB DEFAULT State < /Comment >
  < /State >
  < State >
    < Name > ADDRESS < /Name >
    < Comment > USB ADDRESS State < /Comment >
  < /State >
  < Rst_Arc >
    < To_State > DEFAULT < /To_State >
    < monitor_mode_event_name > HOT_RST < /monitor_mode_event_name >
    < monitor_mode_event_count > 1 < /monitor_mode_event_count >
    < comment > HOT_reset_to_DEFAULT < /comment >
  < /Rst_Arc >
< /GEN_FSM >

To extract data under the tag 'State', I need to decide whether to retrieve it as an array or as a single element, since the number of entries under the same tag name varies. One way to decide is to keep a counter of occurrences for each tag and then, depending on the counter value, use an @array or a $variable. I would appreciate a better, more efficient way. Please help!

Re: Extracting data from XML file
by mirod (Canon) on Feb 23, 2010 at 10:41 UTC

    Why do you want to get the result either in an array or in a scalar? Why not always get an array and always process it the same way? Why special case 1-element arrays?
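
    A minimal sketch of "always process it the same way", assuming the parser may hand back either a single value or an array reference. The helper name as_list and the sample data are hypothetical, not from the thread:

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: return a list whether the parser produced
    # a single value or an array reference.
    sub as_list {
        my ($value) = @_;
        return ()      unless defined $value;
        return @$value if ref $value eq 'ARRAY';
        return ($value);
    }

    # One State may arrive as a hash, several as an array of hashes;
    # the same loop handles both.
    my $single = { Name => 'POWERED' };
    my $many   = [ { Name => 'POWERED' }, { Name => 'DEFAULT' } ];

    for my $states ( $single, $many ) {
        print join( ', ', map { $_->{Name} } as_list($states) ), "\n";
    }
    ```

    With this in place no per-tag counter is needed: the calling code never has to know how many entries there were.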

    Whichever you choose, XML::Simple can do it. If you want to always get an array use the ForceArray option, otherwise don't.
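
    As a rough illustration of the ForceArray option, here is a sketch using a sample trimmed to a single State (the trimmed sample is an assumption made to show the one-element case). KeyAttr => [] is added to switch off XML::Simple's array folding; it is not something the thread asked for:

    ```perl
    use strict;
    use warnings;
    use XML::Simple qw(XMLin);

    # Sample reduced to one State to show the single-entry case.
    my $xml = q{<GEN_FSM>
      <State><Name>POWERED</Name><Comment>USB POWERED State</Comment></State>
    </GEN_FSM>};

    # ForceArray => ['State'] makes State an array ref even with one entry;
    # KeyAttr => [] disables folding of arrays into hashes.
    my $data = XMLin( $xml, ForceArray => ['State'], KeyAttr => [] );

    for my $state ( @{ $data->{State} } ) {
        print "$state->{Name}: $state->{Comment}\n";
    }
    ```

    Without ForceArray, a lone State would come back as a plain hash reference and the loop above would fail.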

    If you are only interested in the data in the State elements, then you can use a number of other modules, XML::LibXML, XML::Rules, or my XML::Twig.
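
    For the XML::Twig route, something along these lines might work. It is only a sketch: the shortened sample document and the hash keys name and comment are illustrative choices, not part of the original question:

    ```perl
    use strict;
    use warnings;
    use XML::Twig;

    my $xml = q{<GEN_FSM>
      <State><Name>POWERED</Name><Comment>USB POWERED State</Comment></State>
      <State><Name>DEFAULT</Name><Comment>USB DEFAULT State</Comment></State>
      <Rst_Arc><To_State>DEFAULT</To_State></Rst_Arc>
    </GEN_FSM>};

    my @states;    # always an array, however many State elements appear

    my $twig = XML::Twig->new(
        twig_handlers => {
            # Called once for each completely parsed State element.
            State => sub {
                my ( $t, $state ) = @_;
                push @states, {
                    name    => $state->first_child_text('Name'),
                    comment => $state->first_child_text('Comment'),
                };
            },
        },
    );
    $twig->parse($xml);

    print "$_->{name}: $_->{comment}\n" for @states;
    ```

    The handler only fires for State, so the Rst_Arc element is parsed but otherwise ignored.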

      ++ for always treating it as an array.

      #!/usr/local/bin/perl
      use strict;
      use warnings;
      use XML::LibXML;

      my $xml = <<_XML;
      <GEN_FSM>
        <State>
          <Name>POWERED</Name>
          <Comment>USB POWERED State</Comment>
        </State>
        <State>
          <Name>DEFAULT</Name>
          <Comment>USB DEFAULT State</Comment>
        </State>
        <State>
          <Name>ADDRESS</Name>
          <Comment>USB ADDRESS State</Comment>
        </State>
        <Rst_Arc>
          <To_State>DEFAULT</To_State>
          <monitor_mode_event_name>HOT_RST</monitor_mode_event_name>
          <monitor_mode_event_count>1</monitor_mode_event_count>
          <comment>HOT_reset_to_DEFAULT</comment>
        </Rst_Arc>
      </GEN_FSM>
      _XML

      my $parser = XML::LibXML->new();
      my $doc    = $parser->parse_string( $xml );
      my $root   = $doc->getDocumentElement;
      my @nodes  = $root->findnodes( '/GEN_FSM/State/Name' );

      foreach my $node ( @nodes ) {
          print $node->textContent, "\n";
      }
      -derby
Re: Extracting data from XML file
by Anonymous Monk on Feb 23, 2010 at 06:39 UTC
    In XML file there is a possibility that multiple entries under same tag name can exist. Ex:...

    That is not XML.

    In case 'B' it returns me an array of hashes of B's data.

    What is it? Please write a complete, self-contained program that demonstrates what you get and how it differs from what you want (see How (Not) To Ask A Question).

      That is not XML.

      I'm not quite sure how to interpret "multiple entries under same tag name can exist". On the other hand, tags can repeat, and even values within tags can repeat (though there are better solutions). In both cases it is well-formed XML.

      With minor edits, the example provided is well-formed XML.

      <?xml version="1.0" encoding="UTF-8"?>
      <GEN_FSM xsi:noNamespaceSchemaLocation="Desktop/test.xsd"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
        <State>
          <Name>POWERED</Name>
          <Comment>USB POWERED State</Comment>
        </State>
        <State>
          <Name>DEFAULT</Name>
          <Comment>USB DEFAULT State</Comment>
        </State>
        <State>
          <Name>ADDRESS</Name>
          <Comment>USB ADDRESS State</Comment>
        </State>
        <Rst_Arc>
          <To_State>DEFAULT</To_State>
          <monitor_mode_event_name>HOT_RST</monitor_mode_event_name>
          <monitor_mode_event_count>1</monitor_mode_event_count>
          <comment>HOT_reset_to_DEFAULT</comment>
        </Rst_Arc>
      </GEN_FSM>

      It can even be validated by a schema (auto generated):

      <?xml version="1.0" encoding="UTF-8"?>
      <xsd:schema elementFormDefault="qualified"
                  xmlns:xsd="http://www.w3.org/2001/XMLSchema">
        <xsd:element name="GEN_FSM">
          <xsd:complexType>
            <xsd:choice minOccurs="0" maxOccurs="unbounded">
              <xsd:element ref="State"/>
              <xsd:element ref="Rst_Arc"/>
            </xsd:choice>
          </xsd:complexType>
        </xsd:element>
        <xsd:element name="State">
          <xsd:complexType mixed="true">
            <xsd:choice minOccurs="0" maxOccurs="unbounded">
              <xsd:element ref="Name"/>
              <xsd:element ref="Comment"/>
            </xsd:choice>
          </xsd:complexType>
        </xsd:element>
        <xsd:element name="Name" type="xsd:string"/>
        <xsd:element name="Comment" type="xsd:string"/>
        <xsd:element name="Rst_Arc">
          <xsd:complexType mixed="true">
            <xsd:choice minOccurs="0" maxOccurs="unbounded">
              <xsd:element ref="To_State"/>
              <xsd:element ref="monitor_mode_event_name"/>
              <xsd:element ref="monitor_mode_event_count"/>
              <xsd:element ref="comment"/>
            </xsd:choice>
          </xsd:complexType>
        </xsd:element>
        <xsd:element name="To_State" type="xsd:string"/>
        <xsd:element name="monitor_mode_event_name" type="xsd:string"/>
        <xsd:element name="monitor_mode_event_count" type="xsd:integer"/>
        <xsd:element name="comment" type="xsd:string"/>
      </xsd:schema>
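
      Schema validation of this kind can also be done from Perl. The sketch below assumes XML::LibXML is installed and uses a deliberately reduced schema and two tiny documents (all illustrative, not the ones posted above); is_valid is a hypothetical helper name:

      ```perl
      use strict;
      use warnings;
      use XML::LibXML;

      # Reduced, illustrative schema: GEN_FSM holds one or more State
      # elements, each with a Name followed by a Comment.
      my $xsd = q{<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
        <xsd:element name="GEN_FSM">
          <xsd:complexType>
            <xsd:sequence>
              <xsd:element name="State" maxOccurs="unbounded">
                <xsd:complexType>
                  <xsd:sequence>
                    <xsd:element name="Name" type="xsd:string"/>
                    <xsd:element name="Comment" type="xsd:string"/>
                  </xsd:sequence>
                </xsd:complexType>
              </xsd:element>
            </xsd:sequence>
          </xsd:complexType>
        </xsd:element>
      </xsd:schema>};

      my $schema = XML::LibXML::Schema->new( string => $xsd );

      # validate() dies on failure, so wrap it in eval to get a boolean.
      sub is_valid {
          my ($xml) = @_;
          my $doc = XML::LibXML->load_xml( string => $xml );
          return eval { $schema->validate($doc); 1 } ? 1 : 0;
      }

      my $good = q{<GEN_FSM><State><Name>POWERED</Name><Comment>USB POWERED State</Comment></State></GEN_FSM>};
      my $bad  = q{<GEN_FSM><State><Name>POWERED</Name></State></GEN_FSM>};

      print "good: ", is_valid($good), "\n";
      print "bad:  ", is_valid($bad),  "\n";
      ```

      The second document omits Comment, which the reduced schema requires, so it should be reported as invalid.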

      Cheers

      Harry