The problem with XML::Simple is that unless you fiddle with ForceArray and ForceContent, the resulting data structure is not consistent. If a tag sometimes has both text content and attributes and sometimes only the content, you get a hash in one place and a scalar in another. If a tag is repeated within its parent in one place but occurs only once in another, you get an array of hashes/scalars the first time and a single hash/scalar the second.
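To make that concrete, here is a minimal sketch. The `<order>`/`<item>` documents are made up for illustration, and KeyAttr is switched off so that attribute folding does not muddy the picture; only the ForceArray and ForceContent settings matter here.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin);

# Three hypothetical documents that differ only in how <item> appears.
my $plain = '<order><item>apple</item></order>';
my $attr  = '<order><item id="1">apple</item></order>';
my $two   = '<order><item>apple</item><item>pear</item></order>';

# Without ForceArray/ForceContent the shape of {item} follows the data:
my $loose_plain = XMLin($plain, KeyAttr => []);  # item is a plain string
my $loose_attr  = XMLin($attr,  KeyAttr => []);  # item is a hash (id + content)
my $loose_two   = XMLin($two,   KeyAttr => []);  # item is an array of strings

# With both options set, every document yields an array of hashes.
my %opts   = (KeyAttr => [], ForceArray => ['item'], ForceContent => 1);
my @strict = map { XMLin($_, %opts) } ($plain, $attr, $two);

for my $data ($loose_plain, $loose_attr, $loose_two, @strict) {
    my $item = $data->{item};
    print ref($item) ? ref($item) : 'plain string', "\n";
}
```

With the second set of options, code that walks the structure can always do `for my $item (@{ $data->{item} }) { ... $item->{content} ... }` without checking `ref` first.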

If you know your data, you can set XML::Simple's options accordingly. Or you can ask XML::Rules to infer the rules from either the DTD or one or more examples, and obtain a consistent data structure almost identical to the one created by a well-configured XML::Simple.

How efficient XML::Rules is with large documents depends on the rules. They determine whether you keep all the data from the document, filter out the bits you do not need as you go, or process parts of the XML and then forget the data you no longer need.
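A rough sketch of the "process and forget" approach follows. The `<items>` document and the `$total` accumulator are invented for the example. The rule for `<item>` handles each item as soon as its closing tag is seen and returns nothing, so nothing from that item is kept in the parent's data.

```perl
use strict;
use warnings;
use XML::Rules;

# Hypothetical input; in real use this would be a large file.
my $xml = '<items>'
        . '<item><name>apple</name><price>3</price></item>'
        . '<item><name>pear</name><price>5</price></item>'
        . '</items>';

my $total = 0;

my $parser = XML::Rules->new(
    rules => {
        # Child tags such as <name> and <price> are passed to the
        # parent as tagname => text content.
        _default => 'content',

        # Called when </item> is reached; the children are already in $attrs.
        item => sub {
            my ($tag, $attrs) = @_;
            $total += $attrs->{price};
            return;    # empty list: this item is not stored anywhere
        },
    },
);

$parser->parse($xml);
print "total: $total\n";
```

If the `item` rule were one of the built-in rules that keep the data instead of a subroutine returning an empty list, every item would stay in memory until the parse finished.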


In reply to Re^3: Why oh why is working with XML so bloomin' difficult in Perl? by Jenda
in thread Modified title: The structures created by many of the XML parsers in Perl appear unnecessarily deep in levels... by jfroebe