in reply to Going between XML and CGi by way of DTD

Although it might seem logical at first glance, it's not possible to map between the two systems.

To be more specific, while it's possible to map every possible output from a CGI form to an XML DTD, it is not possible to go the other way in the general case.

You could write a mapping between a particular DTD/Schema and the relevant CGI script/form (which would have to include validation), but you couldn't generalise. And the DTD/Schema would have to have certain characteristics in order for it to work.
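
To illustrate the easy direction (form output to XML), here is a minimal sketch, in Python rather than Perl purely for brevity. The function name `form_to_xml` and the `<form>`/`<field>` element names are my own assumptions, not anything from the original question. Every submission it produces would conform to a small fixed DTD along the lines of `<!ELEMENT form (field*)>`, `<!ELEMENT field (#PCDATA)>`, `<!ATTLIST field name CDATA #REQUIRED>`.

```python
from urllib.parse import parse_qs
import xml.etree.ElementTree as ET


def form_to_xml(query_string, root_name="form"):
    """Turn a URL-encoded form submission into a flat XML document.

    Field names go into an attribute rather than the element name,
    because a form field name is not guaranteed to be a valid XML name.
    """
    root = ET.Element(root_name)
    # keep_blank_values=True so that empty fields still show up.
    fields = parse_qs(query_string, keep_blank_values=True)
    for name, values in fields.items():
        for value in values:  # a repeated field becomes repeated elements
            child = ET.SubElement(root, "field", {"name": name})
            child.text = value
    return ET.tostring(root, encoding="unicode")


print(form_to_xml("foo=a&bar=b&bar=c"))
```

A form submission is only ever a flat list of name/value pairs, which is why this direction always works and the reverse does not.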



Nobody says perl looks like line-noise any more
kids today don't know what line-noise IS ...
Re^2: Going between XML and Cgi by way of DTD
by throop (Chaplain) on Aug 26, 2007 at 20:58 UTC
    What specs in a DTD do you see as being particular problems?

    I'd settle for a restricted subset - either demanding that DTD/Schema conform to some limitations or having the code ignore the problematic parts of the DTD.

    Here's the background: the customer said "Write a set of CGI scripts in support of performing code inspections. Some of the information that you'll need is already sitting in a read-only 'wild' XML file (i.e. no DTD or equiv). Store the information you gather into another XML file. Periodically, that XML file will be checked into a configuration management system." The customer supplied an XML fragment (annotated) as a rough spec of what he had in mind. There are existing scripts in the system that offer further guidance. They work, but they are old and somewhat crufty (e.g., they generate ill-formed HTML that loads nonetheless).

    I generated a few scripts that did part of the job, essentially an editor for a CodeInspection. I wrote subroutines producing HTML to add/delete sub-entities from entities ("Add or delete a Reviewer", "Add or delete a line-which-exemplifies-a-bug"). I initially cobbled together a bastardized HoH to store the meta-data about the variables. Then I realized that was silly: why make up a formalism for storing meta-data when DTD/Schema/RelaxNG all do that already? So I stored the meta-data in a .dtd.

    I've already written a few small functions that look at the DTD and generate the code for editing/adding/deleting attributes and indentured entities. As I was doing the work, I thought "Wait, somebody's probably done this already." So I went searching in CPAN and asking here.

    Since I haven't found anything, I'm willing to believe that nobody's done this, or at least not released it. But I don't understand your warning that it would be generally impossible to do.

    thanks
    throop

      OK, I should have known I'd be challenged.

      Here's my thinking. Although XML is often used for traditional two-dimensional row-and-column data, it doesn't have to be. It can have arbitrary levels of nestedness, extra dimensions.

      So if you take this XML

      <items>
        <item>
          <foo>a</foo>
          <bar>b</bar>
          <baz>c</baz>
        </item>
        <item>
          <foo>x</foo>
          <bar>y</bar>
          <baz>z</baz>
        </item>
      </items>
      which follows this DTD:
      <!ELEMENT bar (#PCDATA)>
      <!ELEMENT baz (#PCDATA)>
      <!ELEMENT foo (#PCDATA)>
      <!ELEMENT item (foo, bar, baz)>
      <!ELEMENT items (item+)>
      Yes, it's pretty much plain sailing to generate the HTML.
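
      As a rough sketch of why the flat case is easy (Python used only for illustration; `parse_elements` and `form_fields` are hypothetical helpers, not an existing module), a regex is enough to read the simple declarations above and emit one text input per #PCDATA child. It deliberately ignores anything beyond a plain comma-separated child list, so occurrence markers such as `item+` are skipped.

```python
import re
from html import escape

DTD = """
<!ELEMENT bar (#PCDATA)>
<!ELEMENT baz (#PCDATA)>
<!ELEMENT foo (#PCDATA)>
<!ELEMENT item (foo, bar, baz)>
<!ELEMENT items (item+)>
"""


def parse_elements(dtd):
    """Map each element name to its raw content model, e.g. 'foo, bar, baz'.

    Only handles un-nested parentheses; a real DTD parser is needed
    for anything more elaborate.
    """
    pattern = r"<!ELEMENT\s+(\S+)\s+\((.*?)\)\s*>"
    return {name: model.strip() for name, model in re.findall(pattern, dtd)}


def form_fields(dtd, element):
    """Return one HTML text input per #PCDATA child of `element`."""
    models = parse_elements(dtd)
    lines = []
    for child in (c.strip() for c in models[element].split(",")):
        # Children with occurrence markers (item+, item?) or non-text
        # content are not in `models` as #PCDATA, so they are skipped.
        if models.get(child) == "#PCDATA":
            name = escape(child)
            lines.append(
                '<label>%s: <input type="text" name="%s"></label>' % (name, name)
            )
    return lines


for line in form_fields(DTD, "item"):
    print(line)
```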

      But what about this?

      <items>
        <item>
          <foo>a</foo>
          <bar>b</bar>
          <baz>c</baz>
          <item> <!-- items can contain sub-items -->
            <foo>d</foo>
            <bar>e</bar>
            <baz>f</baz>
          </item>
        </item>
        <item>
          <foo>x</foo>
          <bar>y</bar>
          <baz>z</baz>
        </item>
      </items>
      Which matches the DTD with one small change:
      <!ELEMENT bar (#PCDATA)>
      <!ELEMENT baz (#PCDATA)>
      <!ELEMENT foo (#PCDATA)>
      <!ELEMENT item (foo, bar, baz, item?)>
      <!ELEMENT items (item+)>

      What's my HTML form going to look like when any given <item> element can contain another <item>, which can contain another, to an arbitrary depth?



      Nobody says perl looks like line-noise any more
      kids today don't know what line-noise IS ...
        There are two intertwined challenges here: (1) Generating a display/editing form that corresponds precisely to the XML and the DTD. (2) Rendering this in a way that makes sense visually to the user.

        I think I can do that for your example:

        <html><body>
        <table border=1 cellspacing=0 cellpadding=3>
        <caption>Items</caption>
        <tr><td rowspan=6>Item1</td><td colspan=2>foo:</td><td>a</td></tr>
        <tr><td colspan=2>bar:</td><td>b</td></tr>
        <tr><td colspan=2>baz:</td><td>c</td></tr>
        <tr><td rowspan=3>Item2</td><td>foo:</td><td>d</td></tr>
        <tr><td>bar:</td><td>e</td></tr>
        <tr><td>baz:</td><td>f</td></tr>
        <tr><td rowspan=3>Item3</td><td colspan=2>foo:</td><td>x</td></tr>
        <tr><td colspan=2>bar:</td><td>y</td></tr>
        <tr><td colspan=2>baz:</td><td>z</td></tr>
        </table>
        </body></html>
        (If you download this into a scratch.html file, and view the result in your browser, you'll see better what I'm doing here.)

        Some tricks here: you've got to traverse the whole tree before you start generating the table, because you need to know the maximum nesting depth. The table needs one column per level of nesting, plus two more for the field name and its value. And you have to know how many leaf elements there are within each item (sub-items included), so you know how many rows tall to make its block.
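
        A minimal sketch of that two-pass idea, in Python for illustration only (it is not the code I actually used; `max_depth`, `leaf_count`, `item_rows` and `render` are names made up for this example). It assumes the nested element is always called `item` and that every item has at least one field, which the DTD above guarantees.

```python
import xml.etree.ElementTree as ET
from html import escape

SAMPLE = (
    "<items><item><foo>a</foo><bar>b</bar><baz>c</baz>"
    "<item><foo>d</foo><bar>e</bar><baz>f</baz></item></item>"
    "<item><foo>x</foo><bar>y</bar><baz>z</baz></item></items>"
)


def max_depth(node, tag="item"):
    """Pass 1a: deepest nesting of <item> at or below `node`."""
    own = 1 if node.tag == tag else 0
    return own + max((max_depth(c, tag) for c in node), default=0)


def leaf_count(node):
    """Pass 1b: number of leaf elements below `node` (its row height)."""
    kids = list(node)
    return 1 if not kids else sum(leaf_count(k) for k in kids)


def item_rows(item, depth, total_depth, counter):
    """Pass 2: rows (lists of <td> strings) for one item and its sub-items."""
    counter[0] += 1
    # The label cell spans every row of this item's block; it is attached
    # to the first row only.
    pending = ["<td rowspan=%d>Item%d</td>" % (leaf_count(item), counter[0])]
    rows = []
    for child in item:
        if child.tag == "item":
            sub = item_rows(child, depth + 1, total_depth, counter)
            sub[0] = pending + sub[0]
            rows.extend(sub)
        else:
            # Shallower items stretch the name cell across the unused
            # nesting columns so every row has the same total width.
            span = total_depth - depth + 1
            name = "<td colspan=%d>%s:</td>" % (span, escape(child.tag))
            value = "<td>%s</td>" % escape(child.text or "")
            rows.append(pending + [name, value])
        pending = []
    return rows


def render(xml_text):
    root = ET.fromstring(xml_text)
    total_depth = max_depth(root)
    counter = [0]  # running item number, shared across the recursion
    out = ["<table border=1 cellspacing=0 cellpadding=3>"]
    for item in root:
        for row in item_rows(item, 1, total_depth, counter):
            out.append("<tr>" + "".join(row) + "</tr>")
    out.append("</table>")
    return "\n".join(out)


print(render(SAMPLE))
```

        Apart from an explicit colspan=1 on the innermost rows, this should come out with the same shape as the hand-written table: nine rows, four columns wide.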

        I've previously generated HTML tables to represent structures nested 10 levels deep, with on the order of 200 elements.

        The nesting is a bit of a problem, but one I've solved previously.