Looks like XML::Simple gets a lot of bad press here. I guess it stems from the fact that most people fail to read the docs, even though they are short. Let's see what the problems with XML::Simple are.
First, the data structure it produces is not always consistent. E.g. for XML like this:
<root>
<tag>
<sub>foo</sub>
</tag>
<tag>
<sub>bar</sub>
<sub>baz</sub>
</tag>
</root>
the <sub> is converted to a scalar the first time and to an array of scalars the second time. Big deal! Here comes ForceArray=>[qw(list of tags that may be repeated)].
The next problem is that it's a bit too aggressive in trying to help you, transforming
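A minimal sketch of that fix, assuming the XML above sits in a string; the variable names are just for illustration:

```perl
use strict;
use warnings;
use XML::Simple;

my $xml = <<'XML';
<root>
  <tag>
    <sub>foo</sub>
  </tag>
  <tag>
    <sub>bar</sub>
    <sub>baz</sub>
  </tag>
</root>
XML

# Without ForceArray the first <sub> would come back as a plain scalar.
my $data = XMLin($xml, ForceArray => ['sub']);

# Every <sub> is now an array reference, even when there is only one.
for my $tag (@{ $data->{tag} }) {
    print join(', ', @{ $tag->{sub} }), "\n";
}
```

With that, the loop can dereference $tag->{sub} the same way for both <tag> elements.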
<root>
<tag>
<name>foo</name>
<value>475</value>
</tag>
<tag>
<name>bar</name>
<value>147</value>
</tag>
</root>
to
{
'tag' => {
'bar' => {
'value' => '147'
},
'foo' => {
'value' => '475'
}
}
}
Again, huge deal, READ THE DOCS and set KeyAttr => [] or to whatever list of tags/attributes you do want to fold on.
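For illustration, here is a small sketch comparing the default folding with KeyAttr => [] on the XML above (variable names are mine):

```perl
use strict;
use warnings;
use XML::Simple;

my $xml = <<'XML';
<root>
  <tag><name>foo</name><value>475</value></tag>
  <tag><name>bar</name><value>147</value></tag>
</root>
XML

# Default KeyAttr folds on <name>, giving a hash keyed by 'foo' and 'bar'.
my $folded = XMLin($xml);

# An empty KeyAttr list disables folding, so <tag> stays an array of hashes.
my $plain = XMLin($xml, KeyAttr => []);

print "$_->{name} = $_->{value}\n" for @{ $plain->{tag} };
```

In the unfolded version each <tag> keeps its own name and value keys, so nothing depends on which child tag happens to be called "name".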
There is one problem, though, that has not been adequately handled in XML::Simple yet: the inconsistency of
<root>
<tag>content only</tag>
<tag attr="1">and content</tag>
</root>
If you have a tag that has only optional attributes, and it sometimes has them and sometimes doesn't, it's harder than necessary to get at the content. You have to use ref() to see whether the <tag> produced a scalar or a hashref. There is an option that can force XML::Simple to always produce the hashref, but it applies to all tags, not just those few that it makes sense for. It's not actually that hard to implement it so that it supports the same kind of settings as ForceArray. I just did that and will send a patch to the module maintainer shortly.
So all you have to do to get a nice, clean, consistent, minimal data structure out of the XML is to set ForceArray, KeyAttr and ForceContent accordingly. Big deal.
Besides, you can infer the tags that need ForceArray and ForceContent from the example XMLs, the DTD or the Schema. I actually already have the inferring from example XMLs done for my XML::Rules, and it's trivial to change it to produce the options in the XML::Simple format. The upcoming version of XML::Rules will contain functions for inferring these options from examples and DTDs for both modules.
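A rough sketch of both workarounds as they exist in the released module: the ref() check, and the global ForceContent => 1. The per-tag variant from my patch is not shown, and the variable names are only illustrative:

```perl
use strict;
use warnings;
use XML::Simple;

my $xml = <<'XML';
<root>
  <tag>content only</tag>
  <tag attr="1">and content</tag>
</root>
XML

# Default parse: the first <tag> is a plain string, the second a hashref.
my $mixed = XMLin($xml, KeyAttr => []);
my @texts_mixed = map { ref($_) ? $_->{content} : $_ } @{ $mixed->{tag} };

# ForceContent => 1 turns every element into a hashref with a 'content' key,
# but it applies to all tags in the document, not just <tag>.
my $forced = XMLin($xml, KeyAttr => [], ForceContent => 1);
my @texts_forced = map { $_->{content} } @{ $forced->{tag} };

print "$_\n" for @texts_forced;
```

Both lists end up with the same two strings; the difference is only whether the calling code has to ask ref() first.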
P.S.: Sporti69, you may of course consider using my XML::Rules instead. With a little more work it can give you a more streamlined, or even filtered and tweaked, data structure, and it would let you process the XML in chunks instead of loading everything into memory first and only then giving you a chance to process anything.
P.P.S.: I did not discuss one "problem" of XML::Simple: it doesn't preserve the order of child tags. When was the last time you needed that when extracting data from a data-oriented XML? That information would just waste memory and possibly complicate the access in such applications. Of course it means that XML::Simple is not suited for document-oriented XML, nor for modifying XML that's supposed to be consumed by a stricter application. If you need that, use a different module.