Moose, the tree structure of XML and object-oriented inheritance hierarchies

Moose has a very nice method modifier called augment that allows one to use OO inheritance to separate general and specific XML rendering. For instance, if all of your XML must have certain outer wrapper tags, then your base class method can build that XML and call inner() where the specific content belongs. Using XML::Generator, the body of your base class method would be:

      QBXML(
        QBXMLMsgsRq( {onError => "stopOnError"},
                 inner()
                 ));

And a specific subclass to build the inner XML might be:

augment 'as_xml' => sub {
    my ($self, $name)=@_;

      VendorAddRq(
      VendorAdd(
          Name($name)));
};

Well, that's a very nice start on taking the directed-graph nature of XML and layering it on the directed-graph nature of class-based OO inheritance.
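For reference, the two fragments above can be assembled into one runnable file. This is only a sketch under some assumptions: the package names QBXML::Base and QBXML::VendorAdd are made up for illustration, and it calls tags on an XML::Generator object ($X) instead of using the imported tag functions shown in the snippets above.

```perl
use strict;
use warnings;
use XML::Generator;

# One shared generator object; pretty => 2 indents nested tags by two spaces.
my $X = XML::Generator->new(pretty => 2);

{
    package QBXML::Base;    # hypothetical name
    use Moose;

    # General wrapper: every request gets the same outer tags.
    sub as_xml {
        my ($self) = @_;
        return $X->QBXML(
            $X->QBXMLMsgsRq({ onError => 'stopOnError' },
                inner()));    # filled in by the subclass's augment
    }
}

{
    package QBXML::VendorAdd;    # hypothetical name
    use Moose;
    extends 'QBXML::Base';

    # Specific part: only the XML that is particular to this request.
    augment 'as_xml' => sub {
        my ($self, $name) = @_;
        return $X->VendorAddRq(
            $X->VendorAdd(
                $X->Name($name)));
    };
}

my $xml = QBXML::VendorAdd->new->as_xml('Acme Supplies');
print "$xml\n";
```

Note that inner() hands the original arguments of as_xml on to the augmenting sub, which is how $name reaches the subclass code.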

problem statement

Ok, so what happens when there are numerous optional, nested child nodes? For instance, the code above generates the general wrapper and specific code for the required fields of a Quickbooks VendorAdd request:

  <?xml version="1.0" encoding="utf-8"?>
  <?qbxml version="10.0"?>
  <QBXML>
    <QBXMLMsgsRq onError="stopOnError">
      <VendorAddRq>
        <VendorAdd> <!-- required -->
          <Name >STRTYPE</Name> <!-- required -->
        </VendorAdd>
      </VendorAddRq>
    </QBXMLMsgsRq>
  </QBXML>

But there are numerous optional nodes that are children or grandchildren of the VendorAdd element:

  <?xml version="1.0" encoding="utf-8"?>
  <?qbxml version="10.0"?>
  <QBXML>
    <QBXMLMsgsRq onError="stopOnError">
      <VendorAddRq>
        <VendorAdd> <!-- required -->
          <Name >STRTYPE</Name> <!-- required -->
          <IsActive >BOOLTYPE</IsActive> <!-- optional -->
          <CompanyName >STRTYPE</CompanyName> <!-- optional -->
          <Salutation >STRTYPE</Salutation> <!-- optional -->
          <FirstName >STRTYPE</FirstName> <!-- optional -->
          <MiddleName >STRTYPE</MiddleName> <!-- optional -->
          <LastName >STRTYPE</LastName> <!-- optional -->
          <VendorAddress> <!-- optional -->
            <Addr1 >STRTYPE</Addr1> <!-- optional -->
            <Addr2 >STRTYPE</Addr2> <!-- optional -->
            <Addr3 >STRTYPE</Addr3> <!-- optional -->
            <Addr4 >STRTYPE</Addr4> <!-- optional -->
            <Addr5 >STRTYPE</Addr5> <!-- optional -->
            <City >STRTYPE</City> <!-- optional -->
            <State >STRTYPE</State> <!-- optional -->
            <PostalCode >STRTYPE</PostalCode> <!-- optional -->
          </VendorAddress>
        </VendorAdd>
      </VendorAddRq>
    </QBXMLMsgsRq>
  </QBXML>

so the question is

How do you handle conditional, nested generation of XML using object-oriented mechanisms? As I write this, it seems the best way is for these various children to be various Roles (Moose::Role) which you can call to build the various children, all of which know where in the tree to place themselves, and only place themselves there if they were constructed with renderable data. I.e., if they were null, then they simply don't render.

Practically, if you pull a row from the database, you want to throw that row at a series of constructors, heedless of whether the values are defined or not, and have an instance render in the tree only if its particular data column was defined.
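One way to picture this "render only if defined" behaviour, without committing to Roles yet, is a tiny node class whose render method returns nothing when it was built with undefined data. This is a plain-Perl sketch: OptionalNode and its methods are hypothetical names, not part of any CPAN module, the row columns are invented, and no XML escaping is done.

```perl
use strict;
use warnings;

package OptionalNode;    # hypothetical helper class

sub new {
    my ($class, %args) = @_;
    # tag: element name; value: text content (may be undef);
    # children: arrayref of further OptionalNode objects
    return bless {
        tag      => $args{tag},
        value    => $args{value},
        children => $args{children} || [],
    }, $class;
}

sub render {
    my ($self) = @_;
    my $inner = join '', map { $_->render } @{ $self->{children} };
    $inner = $self->{value} . $inner if defined $self->{value};
    return '' if $inner eq '';    # nothing to show: do not render at all
    return "<$self->{tag}>$inner</$self->{tag}>";
}

package main;

# A database row in which some columns are NULL (undef).
my $row = { name => 'Acme', company => undef, city => 'Springfield', state => undef };

my $vendor = OptionalNode->new(
    tag      => 'VendorAdd',
    children => [
        OptionalNode->new(tag => 'Name',        value => $row->{name}),
        OptionalNode->new(tag => 'CompanyName', value => $row->{company}),
        OptionalNode->new(
            tag      => 'VendorAddress',
            children => [
                OptionalNode->new(tag => 'City',  value => $row->{city}),
                OptionalNode->new(tag => 'State', value => $row->{state}),
            ],
        ),
    ],
);

my $vendor_xml = $vendor->render;
print "$vendor_xml\n";
```

Here the CompanyName and State nodes vanish because their columns are undefined, while VendorAddress survives only because one of its children rendered.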

CPAN

XML::Toolkit

XML::Toolkit does parse a sample XML file and turn it into a bunch of classes. It is much more typing to build XML than with XML::Generator and it does not appear to subclass the parsed XML into an inheritance tree mimicking the XML structure --- it just makes a class per XML element.

So, any conditional rendering based on data would require tortuous conditionals.

XML::Rabbit

XML::Rabbit is a very neat idea for *consuming* XML. It uses XPath so that each attribute knows where to get itself from. So, if XML::Toolkit had parsed the XML and created a bunch of attributes with XPath saying where each one belonged, then you could simply instantiate an object with a hashref of data and only the defined attributes would go about rendering themselves!

XML::Writer::Nest

XML::Writer::Nest is an extension of XML::Writer that allows nested XML to be created by leveraging the automatic destructor call when a scalar leaves scope. Thus, to generate XML similar to the above, we would have:

{ my $VendorAddRq = XML::Writer::Nest->new(tag => 'VendorAddRq');

   { my $VendorAdd = $VendorAddRq->nest('VendorAdd');
   
     { my $Name = $VendorAdd->data(Name => $name) }
  }
}

And then for optional nested nodes, we would have nested code which would optionally create objects:

   { my $VendorAdd = $VendorAddRq->nest('VendorAdd');
   
     { my $Name = $VendorAdd->data(Name => $name) }
     { my $IsActive = $row->{active} ? $VendorAdd->data(IsActive => 1) : 0 }
  }

What happens is that the IsActive node gets rendered depending on whether the incoming hashref has active set. If it is set, then $IsActive will be an object that is designed to render nested XML. Otherwise, $IsActive is a normal scalar and no XML will be rendered.

XML::Generator

XML::Generator code would be similar. But each conditional rendering would have to be a subroutine which returns empty string or auto-generates XML:

      VendorAddRq(
      VendorAdd(
          Name($name),
              maybeRenderIsActive($row)
           )
       )
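A helper like maybeRenderIsActive could be as small as the following sketch. It makes several assumptions: an XML::Generator object in $X instead of imported tag functions, a row hashref with an active key, an empty list (rather than an empty string) as the "render nothing" return value, and true/false as the text for BOOLTYPE. The wrapper function vendor_add_rq is a hypothetical name.

```perl
use strict;
use warnings;
use XML::Generator;

my $X = XML::Generator->new;

# Returns an <IsActive> element when the row has a defined "active"
# column, and an empty list otherwise, so the parent tag simply
# receives one argument fewer.
sub maybeRenderIsActive {
    my ($row) = @_;
    return unless defined $row->{active};
    return $X->IsActive($row->{active} ? 'true' : 'false');
}

# Hypothetical wrapper around the tag structure shown above.
sub vendor_add_rq {
    my ($name, $row) = @_;
    return $X->VendorAddRq(
        $X->VendorAdd(
            $X->Name($name),
            maybeRenderIsActive($row)));
}

my $with    = vendor_add_rq('Acme', { active => 1 });
my $without = vendor_add_rq('Acme', {});
print "$with\n$without\n";
```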

SUMMARY

Perhaps the direct approach of XML::Generator and XML::Writer::Nest is all that is needed? They certainly have far fewer requirements in terms of understanding heavy-duty Moose concepts like Traits. But maybe there is a good marriage between the tree that is XML and the tree that is a hierarchy of Moose objects?

For configurable software construction, the problem with XML::Generator and XML::Writer::Nest is separating XML construction from the boolean function that decides whether to render. In other words, each user of your nested optional XML "specification" should be able to hook in their desired subroutines for deciding which nodes should render. Thus, the XML::Generator code above really should look like this:


augment 'as_xml' => sub {
    my ($self, $name, $optionaldata)=@_;

      VendorAddRq(
      VendorAdd(
          Name($name),
              $self->logic_engine->maybeRenderIsActive(@_) # just give it the whole indata
           )
       );
};

And the programmer can override the default logic_engine with methods based on the data and business requirements. For instance, the default logic_engine might look like this:


package XML::Quickbooks::LogicEngine;

sub IsActive {
  my ($self, $name, $databaserow)=@_;

  $databaserow->{active};
}

1;

And then a developer has the choice of mapping his hashref to the same values as the logic engine or supplying his own logic engine.
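Supplying one's own logic engine might then look like the sketch below. Everything beyond XML::Quickbooks::LogicEngine::IsActive is hypothetical: the constructor, the My::LogicEngine subclass, the status column, and the render_is_active function that stands in for the rendering side.

```perl
use strict;
use warnings;

package XML::Quickbooks::LogicEngine;    # default engine, as above

sub new {
    my ($class) = @_;
    return bless {}, $class;
}

sub IsActive {
    my ($self, $name, $databaserow) = @_;
    return $databaserow->{active};
}

package My::LogicEngine;    # hypothetical site-specific engine
our @ISA = ('XML::Quickbooks::LogicEngine');

# This site stores a status string instead of an "active" flag.
sub IsActive {
    my ($self, $name, $databaserow) = @_;
    return defined $databaserow->{status} && $databaserow->{status} eq 'current';
}

package main;

# Rendering side: asks the engine whether to render and knows nothing
# about how that decision is made.
sub render_is_active {
    my ($engine, $name, $row) = @_;
    return $engine->IsActive($name, $row) ? '<IsActive>true</IsActive>' : '';
}

my $default = XML::Quickbooks::LogicEngine->new;
my $custom  = My::LogicEngine->new;
my $row     = { status => 'current' };    # note: no "active" key

my $from_default = render_is_active($default, 'Acme', $row);
my $from_custom  = render_is_active($custom,  'Acme', $row);
print "default: '$from_default'\ncustom: '$from_custom'\n";
```

The same row renders nothing with the default engine and an IsActive node with the custom one, while the rendering code stays unchanged.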

The mantra of every experienced web application developer is the same: thou shalt separate business logic from display. Ironically, almost all template engines allow violation of this separation principle, which is the very impetus for HTML template engine development.

-- Terence Parr, "Enforcing Strict Model View Separation in Template Engines"

Re: Moose, the tree structure of XML and object-oriented inheritance hiearchies by perigrin (Sexton) on Jun 20, 2011 at 20:26 UTC
Hey as the author of XML::Toolkit I thought I should reply to what you mentioned on RIC and what you mentioned here.

"XML::Toolkit does parse a sample XML file and turn it into a bunch of classes. It is much more typing to build XML than with XML::Generator and it does not appear to subclass the parsed XML into an inheritance tree mimicking the XML structure --- it just makes a class per XML element. So, any conditional rendering based on data would require tortuous conditionals."

My experience is that parsing XML into an object model is tricky. The XML InfoSet may look relatively straightforward but really it isn’t. You mention here that XML::Toolkit doesn’t create an inheritance tree mimicking the XML structure and this is intentional. Such a tree leads to very brittle parsers. Take for example a document like the following:

  <body>
    <div id="content">
      <div id="bacon">
        Bacon ipsum dolor sit amet pancetta jerky tail pork strip steak, t-bone meatloaf salami ham chicken drumstick ball tip short loin ham hock jowl.
      </div>
    </div>
  </body>

How exactly do you model the `<div>` elements there? Do you have a single class that is a subclass of itself? Do you generate unique Div#Content and Div#Bacon classes? XML::Toolkit decides to go with “neither” and rather than an is-a relationship it uses a has-a relationship. If you feed this snippet into XML::Toolkit you get back a single Div class that has-a `text` attribute and an optional `div` attribute. This is a closer mapping to the underlying idea of XML.

XML::Toolkit’s approach falls down however when it comes to being more restrictive. XML::Toolkit doesn’t have proper schema support (yet!) and so it can’t know which elements are required and which are optional. It can’t even know if you are allowed only one of a given element or many. So it guesses, and it always guesses that you’re allowed 0 or more of any element it sees during the generate step.

Later when you go to create a new document from the generated classes, any attribute that is empty is suppressed in the output. So to generate the aforementioned example without the `#bacon` element:

  Body->new(
      div_collection => [ Div->new( id => 'content') ],
  );

Which is a bit more typing than you were looking for, but is still pretty straightforward.
Re: Moose, the tree structure of XML and object-oriented inheritance hiearchies by metaperl (Curate) on Jun 20, 2011 at 21:09 UTC
Just an update. I have conditional XML rendering working with XML::Generator like so:

  augment 'as_xml' => sub {
      my ($self, $name, $opt)=@_;

      VendorAddRq(
          VendorAdd(
              Name($name),
              $self->maybeVendorTypeRef($name, $opt)
          ));
  };

  sub maybeVendorTypeRef {
      my($self,$name,$opt)=@_;

      return unless $opt;
      return unless my $v = $opt->{VendorTypeRef};
      VendorTypeRef(
          $self->hashrender(ListID => $v),
          $self->hashrender(FullName => $v)
      );
  }

  sub hashrender {
      my($self, $key, $hash)=@_;
      return '' unless my $v = $hash->{$key} // undef;
      my $X = XML::Generator->new(pretty => 2);
      $X->$key($v);
  }
Re: Moose, the tree structure of XML and object-oriented inheritance hiearchies by locked_user sundialsvc4 (Abbot) on Jun 23, 2011 at 13:48 UTC
“Nothing personal, but ...” my visceral reaction to this notion is like that of Baby Herman in Who Framed Roger Rabbit: “da whole thing stinks like yestahday’s diapahs!” Remember ... nothing personal. This is a frank comment, but not intended to sting, and strictly intended as purely an engineering critique.

Yes, it’s true: an XML file is a nested, hierarchical data structure. But that does not mean that you should, by any means whatever, construct a procedural software construction to match it! I encourage you, instead, to view it as a database of sorts, and to use XPath expressions to traverse it ... to query it ... and, to focus precisely on those parts of it which concern you at a particular time.

I am extremely skeptical of strategies which attempt to make, shall we say, excessive use of “modifiers” and “augmentations,” and/or which simply become too-complex in the pursuit of brevity. Above all other considerations, I always want to know precisely what a piece of source-code does, and, if I want to change it, I do not want that change to “ripple” ... most especially in ways that I cannot readily see.

Another thing that we can always say about files, especially XML files, is that they evolve. When (not if ...) they do, what’s going to have to happen to your clever-but-delicate contrivance, and why? Presumably you are still asking the same questions and still doing the same things, but how much does the source-code have to change, and how pervasive is that change?
Re^2: Moose, the tree structure of XML and object-oriented inheritance hiearchies by metaperl (Curate) on Jun 27, 2011 at 14:30 UTC
"Another thing that we can always say about files, especially XML files, is that they evolve. When (not if ...) they do, what’s going to have to happen to your clever-but-delicate contrivance, and why? Presumably you are still asking the same questions and still doing the same things, but how much does the source-code have to change, and how pervasive is that change?"

Whose contrivance? What module are you referring to? perigrin's? Well, with XML::Toolkit there is a set of classes to generate XML, similar to how a `::Loader` constructs classes for a database. So, your code itself would change very little because the classes are there to build the XML ... just re-run the loader.