rvosa has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

here's what I am trying to do: I'm running a MySQL database and a socket server script. Clients may connect to this server and send it data in an XML format. The socket server receives the XML and inserts the data contained in it in the right tables and columns in the database (I'm using DBI and XML::Twig). OK, so far so good - I got this to work.

Now, I want clients to be able to send queries to the socket server (these would essentially be SELECT statements though how these are wrapped up is still up in the air), the server retrieves whatever it is the client wants and sends that back, wrapped up in the same valid XML format as the data submissions come in. So that's the tricky bit. For now the results are sent back in generic <resultset><row/></resultset> tags but that's no good. I have looked into various modules (DBIx::XMLMessage, XML::XMLtoDBMS, I think there are others but I haven't looked into them) but I'm not sure which one to use. My understanding of DBIx::XMLMessage is that you have to define what kind of queries might be performed beforehand in various templates. I'm not sure if this is flexible enough. I like the mapping system for XML::XMLtoDBMS, but, again, I'm not sure how flexible the queries can be.

What I'm looking for is a way of doing this that would allow fairly complex queries joining multiple tables to be executed and the results to appear automagically in the proper nesting structure. Obviously, not every possible SQL query under the sun could be wrapped in the XML data format we're using (e.g. "DESCRIBE ..." and that sort of thing wouldn't be possible) so there's some limit to what can be retrieved this way, but I'm looking for something that goes at least a bit further than cases like SELECT employeename FROM employee WHERE emplyoeeID='34534'.

Please share your experiences with this kind of setup (it doesn't seem to be a hugely esoteric thing to do) so I can make an informed decision before I dive into something that may not do what I'm looking for.

Oh, I know this is a fairly vague and general question, sorry about that. I am still exploring how best to set set this system up so I'm looking for general suggestions for XML<=>DBMS, pros and cons.

Thanks.

Replies are listed 'Best First'.
Re: DBMS<=>XML
by diotalevi (Canon) on May 03, 2004 at 23:40 UTC
    I suspect you should share some examples of the variety of schemas you expect to handle. This problem is just going to devolve into a xml schema problem so if you ignore the DBI part you'll be off on a right foot.
      Here's an example of a submission of data:
      <?xml version="1.0" encoding="UTF-8" standalone="no"?> <!DOCTYPE cipresml SYSTEM "cipres.dtd"> <cipresml version="0.0.1" sessionID="1" status="0" user="rvosa" servic +e="Repository"> <submission> <taxa tBID="Taxablock"> <taxon tID="taxonID"> <name>taxon name</name> </taxon> </taxa> </submission> </cipresml>
      In this case there are three tables involved: the sessions table (which stores the sessionID, the username and a timestamp), the taxa table (which stores the tBID, and the sessionID as a foreign key) and the taxon table (which stores the tID, the name, the tBID and the sessionID - the last two are foreign keys referencing the taxa table).

      Here's an example of a query, in SQL, that I want to result in a structure like that descending from the <submission> element:
      SELECT name FROM taxon WHERE tBID='Taxablock' AND sessionID=1 AND tID= +'taxonID';

        Is that the only style of submission you'll get? You implied that your interpreter was going to have to intuit what the submission means. This may be a stupid question but can't you get a meaningful specification for the schemas you'll be expected to consume? From there you just note which schema style was retrieved and use an output reformatter that writes to that style. Also, have you looked at XQuery? I've heard of it in similar contexts except that it may be already designed to handle the problems you're just going to have to invent around.

      I think the w3c has a spec for sql and xml called xsql.
      There is also some perl related development going on with this
      see the following link...
      perl xsql info
        I think the w3c has a spec for sql and xml called xsql.

        Thanks for the heads up, looking into this now...
Re: DBMS<=>XML
by Errto (Vicar) on May 04, 2004 at 02:50 UTC

    Just wondering, why is <resultset><row/></resultset> no good? It seems to me that if all the client is expecting is simple tabular data (say, to be displayed to the end user as a simple table) then that ought to be good enough. Or at least, that's what I thought when I did something very similar at work recently (sadly not in Perl).

    I guess the thing is that if your client expects back highly structured data with specific meanings it's a different matter. I'm not too familiar with modules for this, but it seems like one good way might be to just store your XML templates somewhere (if they'll be fairly static), and use some kind of table to map the results from your query to individual locations (defined using XPath perhaps) in your XML template. I believe you could use XSLT for that too though I haven't tried.

      I guess the thing is that if your client expects back highly structured data with specific meanings it's a different matter.

      That's exactly the problem. Here's the dilemma: I could restrict the types of queries you can do to a subset where the results are neatly marked up (using a template) to get the data in a form that's more complex than <resultset><row/></resultset> but then I'm constrained in terms of the things you can ask, or I can allow for more flexibility in terms of the queries but then I can't really see how I could get output that is not as generic.
Re: DBMS<=>XML
by jZed (Prior) on May 04, 2004 at 22:09 UTC
    DBD::AnyData uses XML::Twig to create and query XML data sources but it doesn't handle all forms of nested tags so may or may not be of use to you.

    But from the sound of what you've said, what you need is a templating system more than an XML querying system. Here's an example that produces output in the same format as your submission from multiple rows returned from a database query. The script is fully functional except for the missing DBI connection params.

    #!perl -w use strict; use DBI; use HTML::Template; my $query = <<''; SELECT tID,tBID,name FROM taxon WHERE sessionID=1 AND name<>'baz' my $template = <<''; <submission><tmpl_loop results> <taxa tBID="<tmpl_var tBID>"> <taxon tID="<tmpl_var tID>"> <name><tmpl_var name></name> </taxon> </taxa></tmpl_loop> </submission> my $dbh = DBI->connect( ... ); my $results = $dbh->selectall_arrayref($query,{Slice=>{}}); my $tmpl = HTML::Template->new ( scalarref => \$template , die_on_bad_params => 0 ); $tmpl->param( results => $results ); print $tmpl->output; __END__
      Hey, thanks for that. Yup, I think it'll have to be something like this, though this limits me to only certain types of queries.