in reply to Re: XML::Simple parsing into a hash wierd behaviour
in thread XML::Simple parsing into a hash wierd behaviour

At one point or another you need to know the structure of the XML. You may give some of that info to the parser and obtain a simplified structure, or give it none and obtain a very generic structure, most probably containing a lot of information you do not really need and, in some cases, have to more or less explicitly ignore or strip. Parsing the XML is just the first step; it may be a short one or a longer one.

Jenda
Enoch was right!
Enjoy the last years of Rome.

Re^3: XML::Simple parsing into a hash wierd behaviour
by ikegami (Patriarch) on Apr 20, 2010 at 15:56 UTC

    I don't disagree with the principle, but the devil is in the details.

    • XML::Simple defaults to unsafe behaviour.

    • The whole idea of doing a little work up front to save work later on just doesn't pan out in my experience with XML::Simple. I've already exposed this myth.

    • Simplifying the tree sounds good, but all it really does is make XML::Simple usable. Alternatives have query mechanisms that allow one to jump around the tree just as easily.
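To make the first point concrete, here is a minimal sketch of one default that is often called unsafe (assuming this is the kind of behaviour meant above): without ForceArray, the shape of the result depends on how many children the document happens to contain. The element names are made up for illustration, and KeyAttr => [] is set only to keep array folding out of the picture.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin);

# Hypothetical documents: same format, one vs. two <Person> children.
my $one = '<Persons><Person><First>Ann</First></Person></Persons>';
my $two = '<Persons><Person><First>Ann</First></Person>'
        . '<Person><First>Bob</First></Person></Persons>';

# KeyAttr => [] disables folding so only the ForceArray default is shown.
my $d1 = XMLin($one, KeyAttr => []);
my $d2 = XMLin($two, KeyAttr => []);

print ref($d1->{Person}), "\n";   # HASH  (a single child is not wrapped)
print ref($d2->{Person}), "\n";   # ARRAY (repeated children become a list)

# With ForceArray the shape no longer depends on the input.
my $d3 = XMLin($one, KeyAttr => [], ForceArray => ['Person']);
print ref($d3->{Person}), "\n";   # ARRAY
```

Code that dereferences $d->{Person} as an array works on the second document and dies on the first, which is why the option usually has to be set by hand.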

    And then there are the limitations of XML::Simple.
    • XML::Simple handles namespaces VERY poorly.

      • It fails if different documents use different prefixes. (Prefixes are arbitrary.)
      • It fails if different documents use different means of specifying the namespace of a given node. (Prefix vs explicit xmlns vs inherited xmlns)

      This defect can be fixed.
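A small sketch of the prefix problem, assuming default options and two made-up documents that differ only in the (arbitrary) prefix bound to the same namespace URI:

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin);

# Hypothetical documents: equivalent to a namespace-aware parser,
# but written with different prefixes for the same URI.
my $doc_a = '<root xmlns:a="urn:example"><a:item>1</a:item></root>';
my $doc_b = '<root xmlns:b="urn:example"><b:item>1</b:item></root>';

my $d_a = XMLin($doc_a);
my $d_b = XMLin($doc_b);

# With default options the hash keys carry the literal prefix, so code
# written against one document does not find the data in the other.
print exists $d_a->{'a:item'} ? "found\n" : "missing\n";   # found
print exists $d_b->{'a:item'} ? "found\n" : "missing\n";   # missing
```

The lookup key is tied to whatever prefix the document author chose, not to the namespace itself.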

    • It can only handle some XML formats.

      • It can't parse formats where one needs to know the order of differently named nodes.
      • It can't generate XML for formats where the order of differently named nodes is relevant or specified.
      • It can't handle formats that intermix text and element nodes (e.g. XHTML).

      This limitation is intrinsic to the design and cannot be fixed.
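The ordering point can be seen with a tiny, made-up document. This is only a sketch with default options; the element names are hypothetical.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin);

# Hypothetical format where the interleaving of <a> and <b> matters.
my $xml  = '<doc><a>1</a><b>2</b><a>3</a></doc>';
my $data = XMLin($xml);

# Children are grouped by name: a => [ '1', '3' ], b => '2'.
# Nothing in the structure records that <b> sat between the two <a>s.
print join(',', @{ $data->{a} }), "\n";   # 1,3
print $data->{b}, "\n";                   # 2
```

Because a Perl hash keyed by element name is the target structure, the relative position of differently named siblings has nowhere to live.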

      I do not use XML::Simple myself, now that I've written XML::Rules, but it worked fine for me, most probably because I was working within the limits of what it was designed for. Of course I had to add a few ForceArrays, but it was still the easiest solution. That is because I did not need namespaces, did not intend to work with document-oriented XML, did not have to handle any optional attributes, ...

      Query mechanisms have two problems ... it's yet another language to learn and debug and it's slow. Compared to navigating a trimmed down data structure, navigating a generic maze of objects must be slow. Of course if you need to navigate far over a complicated path, then the query mechanisms may very well be easier. They probably will. If on the other hand you need to process pretty much everything, your query mechanisms will not help you much. The ->nodeValue()s and ->getAttribute()s all over the place will hurt though.

      Jenda

        Query mechanisms have two problems ... it's yet another language to learn

        I'd rather learn

        Persons/Person

        than learn to do

        GroupTags => { Persons => 'Person' },
        ForceArray => [qw( Person )],

        my @persons = $parent->{Persons} ? @{ $parent->{Persons} } : ();
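For comparison, here is a runnable sketch of both approaches side by side. The input document is hypothetical, XML::LibXML stands in for "an alternative with a query mechanism", and KeyAttr => [] is an added assumption to keep XML::Simple's array folding out of the way.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::Simple qw(XMLin);

# Hypothetical input; element names follow the snippet above.
my $xml = '<Root><Persons>'
        . '<Person><First>Ann</First></Person>'
        . '<Person><First>Bob</First></Person>'
        . '</Persons></Root>';

# XPath: one expression, same result shape for zero, one or many matches.
my $root  = XML::LibXML->load_xml(string => $xml)->documentElement;
my @nodes = $root->findnodes('Persons/Person');
my @xpath_names = map { $_->findvalue('First') } @nodes;

# XML::Simple: options are needed to get a predictable structure.
my $parent = XMLin($xml,
    KeyAttr    => [],                       # assumption: folding disabled
    GroupTags  => { Persons => 'Person' },
    ForceArray => [qw( Person )],
);
my @persons = $parent->{Persons} ? @{ $parent->{Persons} } : ();
my @simple_names = map { $_->{First} } @persons;

print "@xpath_names\n";    # Ann Bob
print "@simple_names\n";   # Ann Bob
```

Both give the same names for this input; the difference is how much has to be configured up front to get there.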

        and it's slow.

        That's not the case. XML::Simple is still *far* slower.

        The ->nodeValue()s and ->getAttribute()s all over the place will hurt though.

        I have no problem with a little more wordiness in a series of cut-and-paste lines if all the trickiness goes away in exchange.