in reply to Re: Datastructures to XML
in thread Datastructures to XML

Who said anything about records? Who said the datastructure is gonna be AoH? I would like to support any kind of datastructure. And be able to produce even more complex (read crazy) XML. I think you'd find this style getting quickly out of hand as soon as you attempted to support AoA,HoA,HoH,HoHoH,HoAoH, ... or as soon as you needed to produce things like

<record id="1">
  <foo>Hello</foo>
  <bar>World</bar>
</record>
OK, so you add a way to specify which "field" is gonna become an attribute of the record tag. Then you find out you sometimes need more. That sometimes the name of the field doesn't match the name of the attribute. ...

For the simpler task of converting an AoH to (more or less) record-based XML, your solution is probably simpler. Whether it is easier to use, I'm not so sure. The templates, as I see them, let the user specify what the result should look like and then mark what is to be repeated for the A and what for the H, where to put the key and where the value from the hash, which tags or data are static, etc.

Thanks for the comment anyway, of course. I actually think your module might be a nice little addition to CPAN. Or maybe it could be added to XML::Records or XML::RAX, as a way to go in the opposite direction from what those modules were originally made for.

Re^3: Datastructures to XML
by ELISHEVA (Prior) on Mar 18, 2009 at 02:22 UTC
    Who said anything about records? Who said the datastructure is gonna be AoH?

    Well, your question began: how would you convert this data, and this data was an AoH. You also stated that your inspiration for the meditation was a SOPW post that was also about an AoH. Conceptually, many people think of an AoH as an array of records, hence the terminology "record".

    I would like to support any kind of datastructure.

    That is an excellent goal, but it sits in tension with the goal of "making it easy" - as you observed about writing documentation when you responded to zentara. Much of the appeal of templating solutions comes from the fact that they are more or less WYSIWYG: there is less guesswork about what the output will be. However, this benefit usually shrinks when templating languages add more and more features to handle various edge cases. As extra syntax accumulates, the template begins to look less and less like the actual output.

    I think you'd find this style getting quickly out of hand as soon as you attempted to support AoA,HoA,HoH,HoHoH,HoAoH, ... or as soon as you needed to produce things like...

    If you want to handle more complex data structures, it can be done without option keys multiplying like rabbits or creating hundreds of format constants. The key is to understand the meta structure of the problem. As with the templating approach, you need to do two things:

    • Provide a way to handle H, A, AoA, and AoH
    • Provide a way to handle array elements and hash values that are non-scalar: references to H, A, AoA, AoH, and blessed objects

    With those two ingredients you can handle data structures of any complexity, whether you are using a functional approach or a templating approach. Handling fields that have non-scalar values was briefly discussed (though perhaps not very clearly) in my note above, in the paragraph about adding support for recursion and an option hash key that stores a value representation rule: a hash reference holding option hashes keyed by field name or regex. This is really little more than a change in representation from the templating approach: the regex that appears in your template becomes the hash key, and the written-out example XML becomes the option hash assigned to that regex.
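
    As a rough illustration of the recursion ingredient, here is a minimal sketch that dispatches on ref() so that nesting depth stops mattering. The function to_xml is a hypothetical helper invented for this example; it is not genRecord(...) and it leaves out the option hashes and format choices discussed in this thread.

```perl
use strict;
use warnings;

# Hypothetical helper (not genRecord): walk any mix of hashes and arrays.
sub to_xml {
    my ($name, $data) = @_;
    my $type = ref $data;

    if ($type eq 'HASH') {
        # Each key becomes a nested element; keys are sorted for stable output.
        my $inner = join '', map { to_xml($_, $data->{$_}) } sort keys %$data;
        return "<$name>$inner</$name>";
    }
    if ($type eq 'ARRAY') {
        # Each array element repeats the enclosing tag name.
        return join '', map { to_xml($name, $_) } @$data;
    }

    # Scalar leaf: escape the characters XML reserves in text content.
    my $text = defined $data ? $data : '';
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return "<$name>$text</$name>";
}

my $data = { record => [ { foo => 'Hello', bar => 'World' }, { foo => 'A & B' } ] };
print to_xml('root', $data), "\n";
```

    The same function handles H, A, AoH and HoAoH without any special cases, because each level only has to know what to do with its own reference type.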

    As for H, A, AoH, and AoA: AoH is already handled. H is equivalent to AoH with one element, so modifying genRecord(...) to work with H rather than AoH is trivial. Likewise, A is equivalent to AoA with one element. So that leaves AoA and objects. To support AoA, you would need to deal with two scenarios: (a) indexes get mapped to names; (b) each array element gets mapped to a nested element that differs only in the value assigned to it. (a) can be handled by modifying the filter function to return the field name rather than just a boolean value. Alternatively, one could add an option hash key "indexToName" that holds an array reference listing the field names in order. (b) can be handled by adding support for two additional format constants: NO_NAME_ATTR_VAL and NO_NAME_TEXT_VAL.
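
    Scenario (a) with an "indexToName" option might look roughly like the sketch below. The helper row_to_hash and the way the option is passed are assumptions made for illustration; the idea is only that once indexes have names, each row is an ordinary hash and the existing AoH path can take over.

```perl
use strict;
use warnings;

# Hypothetical helper: turn one array row into a hash using the names
# listed under the "indexToName" option key.
sub row_to_hash {
    my ($row, $opts) = @_;
    my $names = $opts->{indexToName} || [];
    my %record;
    for my $i (0 .. $#$row) {
        # Indexes without an assigned name are dropped.
        next unless defined $names->[$i];
        $record{ $names->[$i] } = $row->[$i];
    }
    return \%record;
}

my @aoa  = ( [ 1, 'Hello', 'World' ], [ 2, 'Foo', 'Bar' ] );
my $opts = { indexToName => [qw(id foo bar)] };

# After the mapping, the AoA has become an AoH.
my @aoh = map { row_to_hash($_, $opts) } @aoa;
print join(', ', map { $aoh[0]{$_} } qw(id foo bar)), "\n";
```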

    Blessed objects raise other issues: (a) is the object opaque or can you just extract the data associated with the underlying blessed reference? (b) if the object is opaque, one needs to identify which methods should be used as getter methods. Whether one adopts a functional or templating approach one will still need to find a way to provide the same information: (opaque or not/which methods are getters).
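
    Both cases can be sketched with blessed and reftype from Scalar::Util. Everything else here is an assumption for the sake of the example: the helper name object_fields, the convention that a list of getter names marks an object as opaque, and the toy Point class.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed reftype);

# Hypothetical helper: reduce an object to a plain hash of field => value.
# $getters is an optional array ref of method names to call on opaque objects.
sub object_fields {
    my ($obj, $getters) = @_;
    return $obj unless blessed $obj;    # plain data passes through untouched

    if ($getters) {
        # Opaque object: use only the methods the caller named as getters.
        my %fields;
        $fields{$_} = $obj->$_() for @$getters;
        return \%fields;
    }
    if (reftype($obj) eq 'HASH') {
        # Transparent object: copy the underlying hash and ignore the class.
        my %copy = %$obj;
        return \%copy;
    }
    die 'Cannot extract fields from a ' . ref($obj) . " object without getters\n";
}

# Toy class used only to exercise the helper.
package Point {
    sub new { my ($class, %args) = @_; return bless {%args}, $class }
    sub x   { $_[0]{x} }
    sub y   { $_[0]{y} }
}

my $p      = Point->new(x => 1, y => 2, _cache => 'internal');
my $public = object_fields($p, [qw(x y)]);   # getters only
my $raw    = object_fields($p);              # underlying hash, _cache included
print join(',', sort keys %$public), "\n";
print join(',', sort keys %$raw), "\n";
```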

    OK, so you add a way to specify which "field" is gonna become an attribute of the record tag. Then you find out you sometimes need more. That sometimes the name of the field doesn't match the name of the attribute. ...

    Some people would handle things that way, but I wouldn't. The functional approach I described actually does allow one to put "fields" in as attributes already as well as a number of other XML syntax variations. Check the format choices for details. What it didn't allow you to do is rename "fields" or pick and choose which fields are record attributes and which are nested elements.

    However, providing support would require (for the user) no more than a minor modification to the filter function. Instead of returning a simple boolean, the filter function would return:

    • undef if the field should be skipped
    • the name of the field if the field should be included in the record. If only a name is returned the field will be a record attribute or nested element according to the option hash passed into genRecord(...)
    • a reference to an array or hash containing two bits of data: the field name and an option hash that overrides the one passed into genRecord(...) but only for that particular field.
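
    The three return conventions above could be interpreted along these lines. This is only a sketch: apply_filter, the "as" option key, and the name/options keys of the hash-ref form are all invented for illustration, since the real option hash of genRecord(...) is not reproduced in this post.

```perl
use strict;
use warnings;

# Hypothetical: normalise a filter's return value into (name, options),
# or an empty list when the field is to be skipped.
sub apply_filter {
    my ($filter, $field, $value, $defaults) = @_;
    my $result = $filter->($field, $value);

    return unless defined $result;                    # undef: skip the field
    return ($result, $defaults) unless ref $result;   # plain name: keep defaults

    # Array ref [name, \%overrides] or hash ref { name => ..., options => ... }.
    my ($name, $overrides) = ref($result) eq 'ARRAY'
        ? @$result
        : @{$result}{qw(name options)};

    # Per-field overrides win over the options passed in for the whole record.
    my %merged = ( %$defaults, %{ $overrides || {} } );
    return ($name, \%merged);
}

my $filter = sub {
    my ($field, $value) = @_;
    return undef if $field =~ /^_/;                                   # skip
    return [ 'id', { as => 'attribute' } ] if $field eq 'record_id';  # rename and override
    return $field;                                                    # include as-is
};

my $defaults = { as => 'element' };
for my $field (qw(_tmp record_id foo)) {
    my ($name, $opts) = apply_filter($filter, $field, 1, $defaults);
    print(defined($name) ? "$field -> $name ($opts->{as})\n" : "$field -> skipped\n");
}
```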

    I would consider a souped-up filter function a better choice than the regex used in the template spec, because the logic involved in choosing field names and field placement (attribute/nested element) may not be reducible to a regex.

    For the simpler task of converting AoH to (more or less) record based XML your solution is probably simpler. Whether easier to use I'm not so sure

    Each person has their own style and you (and probably many others) may simply prefer templating. I find templating approaches more limiting and "less easy", mainly because: (a) even when there are obvious defaults, one still needs to spell out everything in a template - one could say it lacks Huffman encoding; (b) when I really do need to do fancy things like treat some fields as record attributes and some as nested elements (or rename fields), my logic may not be reducible to a regex, whereas a filter function gives me the full power of Perl, including closures; (c) the implementation is more reusable. I can always layer a template language and parser over the functional approach, but I can also experiment with other ease-of-use interfaces.
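
    To make point (b) concrete, here is a small sketch of a filter built as a closure. The factory make_filter and its rename/skip arguments are hypothetical; the point is only that the returned sub carries its own configuration, so the same code yields differently behaved filters without any new template syntax.

```perl
use strict;
use warnings;

# Hypothetical factory: the returned filter closes over %rename and %skip.
sub make_filter {
    my (%cfg) = @_;
    my %rename = %{ $cfg{rename} || {} };
    my %skip   = map { $_ => 1 } @{ $cfg{skip} || [] };

    return sub {
        my ($field) = @_;
        return undef if $skip{$field};                             # drop the field
        return exists $rename{$field} ? $rename{$field} : $field;  # rename or keep
    };
}

my $filter = make_filter(rename => { record_id => 'id' }, skip => ['password']);
my %row    = (record_id => 1, foo => 'Hello', password => 'secret');

# Collect the output names of the fields that survive the filter.
my @kept = grep { defined } map { $filter->($_) } sort keys %row;
print "@kept\n";
```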

    Best, beth

    Update: moved discussion of WYSIWYG and templating to start of post.