in reply to Re: Datastructures to XML
in thread Datastructures to XML

Re the database example: I would consider this an implementation detail. It doesn't really make a difference whether you know the address of another object (in the general sense) in memory, its ID within some collection, or (in the case of N-N relations) a list of pairs of object IDs.
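
To make the claim concrete, here is a minimal sketch in Python (chosen only for brevity; the student/course data and both helper functions are hypothetical). It stores the same N-N relation once as a list of ID pairs and once as direct object references, then converts back. Note that the trip back needs a lookup table from "address" to ID.

```python
# Hypothetical data: students and courses in an N-N relation.
students = {1: {"name": "Ann"}, 2: {"name": "Bob"}}
courses = {10: {"title": "XML"}, 20: {"title": "SQL"}}

# Form 1: the relation stored as a list of (student_id, course_id) pairs.
enrolled = [(1, 10), (1, 20), (2, 10)]


def link_by_reference(students, courses, pairs):
    """Form 2: each student holds direct references to its course objects."""
    for student in students.values():
        student["courses"] = []
    for student_id, course_id in pairs:
        students[student_id]["courses"].append(courses[course_id])
    return students


def pairs_from_references(students, courses):
    """Recover the ID pairs from the reference-based form."""
    # id(obj) stands in for the object's "address" in memory.
    course_id_by_addr = {id(c): cid for cid, c in courses.items()}
    return sorted(
        (student_id, course_id_by_addr[id(course)])
        for student_id, student in students.items()
        for course in student["courses"]
    )


linked = link_by_reference(students, courses, enrolled)
round_trip = pairs_from_references(linked, courses)
```

Both forms carry the same information; which one is convenient depends on whether the data is being processed in memory or written out.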

If I could store the data in any format I wished, it would be much easier, but sometimes I do not control the format. And sometimes, even if I do, the format doesn't map directly to the data structure that best suits the task at hand.

I think I should have explained better what I am really after. I'd like to have a "reverse of XML::Rules". That is, with XML::Rules I can tweak the tree structure of the data from an XML document so that I can work more easily with the resulting structure, where the original structure may have been designed with a different task in mind or may just be more general. Then I would like to have some reasonably simple way to "convert" the data structure back to the original format. Or, for that matter, to a different format, quite often one that was not designed for this particular task.
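
As a rough illustration of the "data structure back to XML" direction, here is a minimal sketch in Python using the standard `xml.etree.ElementTree` module. It is not XML::Rules and not a proposed API: the `to_element` helper, the sample `order` data, and the mapping convention (scalars become attributes, nested dicts become child elements, lists of dicts become repeated child elements, a `_content` key becomes the element text) are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def to_element(tag, data):
    """Convert a nested dict into an Element under a hypothetical convention."""
    elem = ET.Element(tag)
    for key, value in data.items():
        if key == "_content":
            # Text content of the element.
            elem.text = str(value)
        elif isinstance(value, dict):
            # A single nested child element.
            elem.append(to_element(key, value))
        elif isinstance(value, list):
            # Repeated child elements; every item is assumed to be a dict.
            for item in value:
                elem.append(to_element(key, item))
        else:
            # Any other scalar becomes an attribute.
            elem.set(key, str(value))
    return elem


# Hypothetical structure, shaped for the task rather than for the file format.
order = {
    "id": 42,
    "customer": {"_content": "Jane Doe"},
    "item": [
        {"sku": "A1", "qty": 2},
        {"sku": "B7", "qty": 1},
    ],
}

xml_text = ET.tostring(to_element("order", order), encoding="unicode")
```

A fixed convention like this only works when the target format happens to match it. The harder part of the request is letting the caller describe, per tag, how the structure should be mapped.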

Re^3: Datastructures to XML
by ELISHEVA (Prior) on Mar 19, 2009 at 08:44 UTC

    It may be an implementation detail, but it is a very important one that can have a significant impact on a general purpose "data structure to XML" converter.

    Sometimes data structures constructed in memory use pointers in place of ids. When such a structure is converted to a persistent form (XML or otherwise), one must create some sort of id that corresponds to each pointer or reference. One will also need to decide on a name for the tag or attribute that holds the generated id, since there is no corresponding array or hash element to "foreach". Otherwise there will be information loss.
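
    A minimal sketch of that step, in Python with the standard `xml.etree.ElementTree` module. Everything here is an assumption for illustration: the `serialize_people` helper, the `p1`, `p2` id scheme, and the attribute names `id` and `manager_ref`, which the caller has to choose because nothing in the data structure supplies them.

```python
import xml.etree.ElementTree as ET


def serialize_people(people, id_attr="id", ref_attr="manager_ref"):
    """Replace in-memory references with generated ids while writing XML.

    Assumes every referenced manager also appears in `people`.
    """
    # First pass: generate an id for each object, keyed by its identity.
    generated = {id(person): f"p{n}" for n, person in enumerate(people, start=1)}

    root = ET.Element("people")
    # Second pass: emit each person and turn the reference into an id value.
    for person in people:
        elem = ET.SubElement(root, "person")
        elem.set(id_attr, generated[id(person)])
        elem.set("name", person["name"])
        manager = person.get("manager")
        if manager is not None:
            elem.set(ref_attr, generated[id(manager)])
    return root


# Hypothetical data: worker holds a direct reference to boss.
boss = {"name": "Ann", "manager": None}
worker = {"name": "Bob", "manager": boss}

root = serialize_people([boss, worker])
```

    Without the generated `id` and the `manager_ref` attribute, the output would either duplicate the manager or lose the link entirely.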

    In some cases one can just assign sequential ids. Other formats might require GUID generation. Others might want a registered URI. Still other XML formats require that the id match something in a database or flat file. To get the right id one might need to do a lookup on a "soft" id, for example a person's first and last name, or their social security or passport number. Or one might need to add a new record to the database and capture the id assigned by the database.
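
    One way a converter could accommodate these variations is to take the id strategy as a parameter. The sketch below is in Python and entirely hypothetical (the function names, the `known` table, and the sample records are made up). It covers three of the cases above: sequential ids, GUIDs, and a lookup on a "soft" key.

```python
import itertools
import uuid


def sequential_ids(prefix="id"):
    """Strategy: prefix plus a running counter."""
    counter = itertools.count(1)
    return lambda record: f"{prefix}{next(counter)}"


def guid_ids():
    """Strategy: a freshly generated random GUID per record."""
    return lambda record: str(uuid.uuid4())


def lookup_ids(table, soft_key):
    """Strategy: look the id up in an existing table via a "soft" key."""
    def assign(record):
        # Raises KeyError if the record is unknown; a real tool would
        # have to decide whether to insert a new row at this point.
        return table[tuple(record[field] for field in soft_key)]
    return assign


def assign_ids(records, strategy):
    """Apply the chosen strategy to each record in order."""
    return [strategy(record) for record in records]


# Hypothetical records and a stand-in for a database or flat file.
people = [{"first": "Ann", "last": "Lee"}, {"first": "Bob", "last": "Ray"}]
known = {("Ann", "Lee"): "E-100", ("Bob", "Ray"): "E-200"}

seq = assign_ids(people, sequential_ids("p"))
looked_up = assign_ids(people, lookup_ids(known, ("first", "last")))
guids = assign_ids(people, guid_ids())
```

    The "add a record and capture the database-assigned id" case would be a fourth strategy with side effects, which is part of why this is more than a formatting detail.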

    A second issue that I think sundialsvc4 was getting at is the placement of XML elements. Both your template spec and my functional alternative assume a part-container model: elements nested within elements.

    But sundialsvc4 is reminding us of an extremely important and common alternative: the relational model. In the relational model, big ugly objects aren't nested. They are replaced by foreign key fields. The XML for the big-ugly-object is defined elsewhere, perhaps even in a different file. The two may be connected either by matching field values (a la a relational DBMS) or by "references" - the value assigned to the id attribute of the big-ugly-object-in-another-file.
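
    To show the difference in output shape, here is a minimal Python sketch using `xml.etree.ElementTree`. The catalog data, the element names, and the `vendor_id` attribute are hypothetical. The same in-memory structure is written once in part-container style and once in relational style.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: two products share one vendor object.
vendor = {"id": "v1", "name": "Acme"}
products = [
    {"sku": "A1", "vendor": vendor},
    {"sku": "B7", "vendor": vendor},
]


def nested(products):
    """Part-container model: the vendor is repeated inside each product."""
    root = ET.Element("catalog")
    for product in products:
        elem = ET.SubElement(root, "product", sku=product["sku"])
        v = product["vendor"]
        ET.SubElement(elem, "vendor", id=v["id"], name=v["name"])
    return root


def relational(products):
    """Relational model: products carry a foreign key; vendors appear once."""
    root = ET.Element("catalog")
    vendors_seen = {}
    for product in products:
        v = product["vendor"]
        vendors_seen.setdefault(v["id"], v)
        ET.SubElement(root, "product", sku=product["sku"], vendor_id=v["id"])
    # The referenced objects are emitted elsewhere (here: a sibling section;
    # it could equally be a separate file).
    vendors = ET.SubElement(root, "vendors")
    for v in vendors_seen.values():
        ET.SubElement(vendors, "vendor", id=v["id"], name=v["name"])
    return root


nested_root = nested(products)
relational_root = relational(products)
```

    A template that only walks the structure top-down covers `nested` naturally; `relational` needs a second collection pass, which is the part a general-purpose converter has to plan for.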

    Because part-container models are easier to conceptualize, XML schemas often start life using a part-container model and then migrate over time to one that supports more of a relational model (less duplication of big-ugly-objects). For a readily available open source example, study the history of the XML format used with the Ant build tool. Incidentally, the history of DBMS implementation also follows this progression. (Anybody remember CISC-ISAM databases?)

    Any general purpose tool would be wise to support both (or clearly explain its limits in the CAVEATS section of its POD). Otherwise a company using Mondo::Wonderous::Data::XML might find that they have to throw out, rather than modify, their XML generation code as their XML schemas mature.

    Best, beth

Re^3: Datastructures to XML
by locked_user sundialsvc4 (Abbot) on Mar 18, 2009 at 22:35 UTC

    It is not, strictly speaking, “an implementation detail.” When you are dealing with external data collections (be they SQL tables, or XML files or whatever), the notion of “addresses” (hence: references) does not exist. The notion of “keys,” of whatever format you wish, does.

    If you have ever had the unfortunate experience of dealing with an IMAGE or an IDMS database in any past-life you'd much rather forget, then you will know exactly what I am talking about...   :-D

    (You do not, of course, have to answer that. Many I.S. memories are much better left buried in the past.)