TienLung's question in Arrays and Hashes woes, about a way to modify some XML, got me thinking. So did my response, which used XML::Rules to simplify the datastructure while reading it, plus some ugly code to "desimplify" the datastructure again so that it could be converted back to XML in the same format. Is there a nicer way to produce XML in a desired format (or schema, if you will) from a datastructure that contains all the data you need, but is not structured exactly like the XML, doesn't contain all the tag names, doesn't distinguish between attributes and tag content, etc.?

What would you do if you wanted to transform

[ { 'lname' => 'Krynicky', 'fname' => 'Jenda' }, { 'Site' => 'PerlMonks', 'Nick' => 'Jenda' } ];
to
<root>
  <page>
    <field> <ID>fname</ID> <value>Jenda</value> </field>
    <field> <ID>lname</ID> <value>Krynicky</value> </field>
  </page>
  <page>
    <field> <ID>Site</ID> <value>PerlMonks</value> </field>
    <field> <ID>Nick</ID> <value>Jenda</value> </field>
  </page>
</root>
? You can assume the order of the <field> tags and the child tags of <field> doesn't matter.

In Re: Arrays and Hashes woes I used two nested map()s with some anonymous arrays and hashes to convert the datastructure into one that would end up looking like this when printed by XML::Rules, but it's ... not light reading. What would you use?
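For comparison, here is a minimal sketch of the nested-map() style, building the XML string directly instead of going through XML::Rules. It is not the code from the original reply; the names xml_escape and pages_to_xml are made up for this illustration, and the keys are sorted only to make the output repeatable.

```perl
use strict;
use warnings;

# Escape the characters that are special in XML text content.
sub xml_escape {
    my ($text) = @_;
    my %entity = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;');
    $text =~ s/([&<>])/$entity{$1}/g;
    return $text;
}

# Hypothetical helper: array of hashes -> <root><page><field>...</field></page></root>.
sub pages_to_xml {
    my ($pages) = @_;
    return '<root>'
        . join('', map {
            my $page = $_;                      # outer map: one <page> per hash
            '<page>'
              . join('', map {
                    my $key = $_;               # inner map: one <field> per key
                    '<field><ID>' . xml_escape($key) . '</ID><value>'
                      . xml_escape($page->{$key}) . '</value></field>';
                } sort keys %$page)
              . '</page>';
        } @$pages)
        . '</root>';
}

my $data = [
    { lname => 'Krynicky',  fname => 'Jenda' },
    { Site  => 'PerlMonks', Nick  => 'Jenda' },
];

print pages_to_xml($data), "\n";
```

It works, but the shape of the resulting XML is buried in string concatenation, which is exactly the readability problem a template would be meant to solve.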

I was considering a (yes, yet another) template engine (my second actually, the first one was for RTF) geared at this kind of conversion, without the need for complex XPath expressions etc. Here are some examples of what I think it might look like:

Assuming this data structure:

[
  { PageId => 1, Name => 'Civil name', 'lname' => 'Krynicky', 'fname' => 'Jenda' },
  { PageId => 2, Name => 'Online identity', 'Site' => 'PerlMonks', 'Nick' => 'Jenda' }
];

Template:
<root xmlns:cmd="http://jenda.krynicky.cz/XML/Rules/Template/Cmd"
      xmlns:set="http://jenda.krynicky.cz/XML/Rules/Template/Set"
      xmlns:opt="http://jenda.krynicky.cz/XML/Rules/Template/Opt">
  <page cmd:foreach>
    <field cmd:foreachkey>
      <ID set:_content="$key"></ID>
      <value set:_content="$value"></value>
    </field>
  </page>
</root>
Result (reformatted to better fit the table):
<root>
  <page>
    <field><ID>PageId</ID><value>1</value></field>
    <field><ID>Name</ID><value>Civil name</value></field>
    <field><ID>lname</ID><value>Krynicky</value></field>
    <field><ID>fname</ID><value>Jenda</value></field>
  </page>
  <page>
    <field><ID>PageId</ID><value>2</value></field>
    <field><ID>Name</ID><value>Online identity</value></field>
    <field><ID>Site</ID><value>PerlMonks</value></field>
    <field><ID>Nick</ID><value>Jenda</value></field>
  </page>
</root>

Template:
<root xmlns:...>
  <page cmd:foreach set:id="$_->{PageId}">
    <field cmd:foreachkey="! /^PageId$/">
      <ID set:_content="$key"></ID>
      <value set:_content="$value"></value>
    </field>
  </page>
</root>
Result:
<root>
  <page id="1">
    <field><ID>Name</ID><value>Civil name</value></field>
    <field><ID>lname</ID><value>Krynicky</value></field>
    <field><ID>fname</ID><value>Jenda</value></field>
  </page>
  <page id="2">
    <field><ID>Name</ID><value>Online identity</value></field>
    <field><ID>Site</ID><value>PerlMonks</value></field>
    <field><ID>Nick</ID><value>Jenda</value></field>
  </page>
</root>

Template:
<root>
  <page cmd:foreach set:id="$_->{PageId}">
    <field cmd:foreachkey="! /^PageId$/">
      <ID set:_content="$key"></ID>
      <value set:_content="$value" opt:optional></value>
      <!-- this means that the tag is skipped if the $value is undef or empty string -->
    </field>
  </page>
</root>
Result:
Same as above in this case.

Template:
<root>
  <page cmd:foreach set:id="$_->{PageId}" opt:required>
    <field cmd:foreachkey="! /^PageId$/">
      <ID set:_content="$key"></ID>
      <value set:_content="$value" opt:if="defined $value"></value>
      <!-- this means that the tag is skipped only if the value is undef -->
    </field>
  </page>
</root>
Result:
Same as above in this case.

Template:
<root>
  <page cmd:foreach set:id="$_->{PageId}" opt:required>
    <!-- the tag would be required even if the datastructure was empty -->
    <field cmd:foreachkey="! /^PageId$/">
      <cmd:insert set:_tag="$key" set:_content="$value"/>
    </field>
  </page>
</root>
Result:
<root>
  <page id="1">
    <Name>Civil name</Name>
    <lname>Krynicky</lname>
    <fname>Jenda</fname>
  </page>
  <page id="2">
    <Name>Online identity</Name>
    <Site>PerlMonks</Site>
    <Nick>Jenda</Nick>
  </page>
</root>

Template:
<root>
  <page cmd:foreach set:id="$_->{PageId}">
    <name cmd:forkey="Name" set:_content="$_" opt:required/>
    <field cmd:foreachkey="! /^(PageId|Name)$/">
      <ID set:_content="$key"></ID>
      <value set:_content="$value"></value>
    </field>
  </page>
</root>
Result:
<root>
  <page id="1">
    <Name>Civil name</Name>
    <field><ID>lname</ID><value>Krynicky</value></field>
    <field><ID>fname</ID><value>Jenda</value></field>
  </page>
  <page id="2">
    <Name>Online identity</Name>
    <field><ID>Site</ID><value>PerlMonks</value></field>
    <field><ID>Nick</ID><value>Jenda</value></field>
  </page>
</root>
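To make the intended semantics concrete, here is roughly what the template with cmd:foreach, set:id and cmd:foreachkey="! /^PageId$/" would have to do if written by hand. This is only a sketch of the idea, not an implementation of the template engine: foreach_key and render_pages are hypothetical names, the keys are sorted just to get repeatable output, and the values are assumed not to need XML escaping.

```perl
use strict;
use warnings;

# Hypothetical helper: roughly what cmd:foreachkey="CONDITION" is meant to do.
# Calls $body->($key, $value) for every key accepted by $filter.
sub foreach_key {
    my ($hash, $filter, $body) = @_;
    return join '', map { $body->($_, $hash->{$_}) }
                    grep { $filter->($_) } sort keys %$hash;
}

sub render_pages {
    my ($pages) = @_;
    my $xml = '<root>';
    for my $page (@$pages) {                       # cmd:foreach
        $xml .= qq{<page id="$page->{PageId}">};   # set:id="$_->{PageId}"
        $xml .= foreach_key(
            $page,
            sub { $_[0] !~ /^PageId$/ },           # cmd:foreachkey="! /^PageId$/"
            sub {
                my ($key, $value) = @_;            # set:_content="$key" / "$value"
                return "<field><ID>$key</ID><value>$value</value></field>";
            },
        );
        $xml .= '</page>';
    }
    return $xml . '</root>';
}

my $data = [
    { PageId => 1, Name => 'Civil name',      lname => 'Krynicky',  fname => 'Jenda' },
    { PageId => 2, Name => 'Online identity', Site  => 'PerlMonks', Nick  => 'Jenda' },
];

print render_pages($data), "\n";
```

The template expresses the same thing declaratively: the loop, the filter and the attribute all sit on the tags they affect.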

Here's a quick and dirty explanation of the commands and options:

cmd:
(attributes)
foreach = if $_ is an array ref, repeat the tag once per element and, for the children, set $_ to that element
foreachkey = if $_ is a hash ref, repeat the tag once per key and, for the children, set $key to the key and $value to its value
  - if the attribute has any value, it is used as a condition and all elements that do not match the condition are skipped
forkey = if $_ is a hash ref, evaluate the tag for all keys returned by the code in this attribute and set $_ to the corresponding values
  if $_ is an array ref, evaluate the tag for all elements whose indexes are returned by the code and set $_ to those elements
forkeys = if $_ is a hash ref, evaluate the tag for all keys returned by the code in this attribute and set $key to the keys and $value to the values

(tags)
insert = evaluate it like any other tag, but then print either the tag with the name specified by set:_tag, or only the content and/or children
set:
(attributes)
_content : evaluate the expression and put the result into the content; if the tag has children, then the content is output first
<anything else> : evaluate the expression and set that attribute
  in a tag marked with cmd:foreach or cmd:foreachkey, the attribute values are evaluated with $_ (or $key and $value) set as for the child tags
opt:
required - used for tags with cmd:foreach or cmd:foreachkey; the tag will be printed even if there are no (matching) elements
optional - the tag is printed only if at least one set:attribute or set:_content produced some text
if - the tag and its children are printed only if the condition holds
unless - the tag and its children are printed only if the condition doesn't hold
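The difference between opt:optional and opt:if="defined $value" is easy to miss, so here is a small sketch of how the two might behave for a single tag. render_tag is a hypothetical helper invented for this illustration; it looks only at the content, whereas the real opt:optional would also take the set:attributes into account.

```perl
use strict;
use warnings;

# Hypothetical helper: render <$tag>$content</$tag> subject to an opt: rule.
#   optional => skip the tag when the content is undef or an empty string
#   if       => skip the tag when the supplied condition returns false
sub render_tag {
    my ($tag, $content, %opt) = @_;
    return '' if $opt{optional} && !(defined $content && length $content);
    return '' if exists $opt{if} && !$opt{if}->($content);
    $content = '' unless defined $content;
    return "<$tag>$content</$tag>";
}

my $is_defined = sub { defined $_[0] };

# An empty string: dropped by "optional", kept by if => defined.
print 'optional, empty: [', render_tag('value', '',    optional => 1),           "]\n";
print 'if, empty:       [', render_tag('value', '',    if       => $is_defined), "]\n";
# undef: dropped by both.
print 'optional, undef: [', render_tag('value', undef, optional => 1),           "]\n";
print 'if, undef:       [', render_tag('value', undef, if       => $is_defined), "]\n";
```

With the example data every value is a non-empty string, which is why both templates give the same result as the one without any opt: attribute.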

Did I go crazy? Or does it remind you of something that's already implemented? Or do you actually think this could be worth implementing? And have some suggestions for additional features ...

Update on Mar 18, 2009 at 02:50 GMT-2: I would like to have a more general solution than just one for an AoH as presented in the example. I'd like to support any depth and any combination of arrays and hashes, even irregular ones: an array of hashes in which some keys have scalar values, others point to arrayrefs, yet others to hashrefs, ... The structure can be assumed to be consistent and well known, but possibly complex.
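As a baseline for that general case, a fixed recursive walk over arbitrary nesting is short, which shows both what is easy and what a template would add. The sketch below is an assumption-laden illustration, not a proposal: to_xml and xml_escape are made-up names, hash keys are assumed to be valid XML tag names, an array simply repeats the enclosing tag for each element, and everything becomes tag content (no attributes).

```perl
use strict;
use warnings;

# Escape the characters that are special in XML text content.
sub xml_escape {
    my ($text) = @_;
    my %entity = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;');
    $text =~ s/([&<>])/$entity{$1}/g;
    return $text;
}

# Hypothetical helper: recursively serialise $data as one or more <$tag> elements.
sub to_xml {
    my ($tag, $data) = @_;
    if (ref($data) eq 'ARRAY') {
        # An array repeats the enclosing tag once per element.
        return join '', map { to_xml($tag, $_) } @$data;
    }
    if (ref($data) eq 'HASH') {
        # A hash becomes one element with a child per key (sorted for repeatability).
        return "<$tag>"
            . join('', map { to_xml($_, $data->{$_}) } sort keys %$data)
            . "</$tag>";
    }
    # Anything else is treated as text; undef becomes an empty element.
    return "<$tag>" . xml_escape(defined $data ? $data : '') . "</$tag>";
}

my $data = {
    person => { fname => 'Jenda', lname => 'Krynicky' },
    tags   => [ 'perl', 'xml' ],
};

print to_xml(root => $data), "\n";
```

What this cannot do is exactly what the template is for: choosing tag names that are not in the data, turning some keys into attributes, wrapping key/value pairs in <field> elements, or skipping tags conditionally.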


In reply to Datastructures to XML by Jenda
