http://qs1969.pair.com?node_id=181206

XML::Generator::DBI is the replacement for DBIx::XML_RDB, which allowed you to convert a database query into 'on the fly' XML. XML::Generator::DBI does a lot more than just that, it generates SAX events and relies upon a SAX handler to handle them in some manner. This adds more complexity to the process, but it also adds more flexibility, allowing you to do more than simply create a string or XML file. This tutorial aims to showcase the features of XML::Generator::DBI while providing concrete examples of sample usage.

You can download each example individually, or download them all via tar ball


Table of Contents


Example 1: DBI to XML

[download example 1]

Say you want to simply convert a database query into an XML document and your database table, foo, looks like this:

  +-------+------+
  | bar   | baz  |
  +-------+------+
  | one   |    1 |
  | two   |    2 |
  | three |    3 |
  +-------+------+

The following code will convert the results of the SQL statement 'select bar,baz from foo' into an XML document, foo.xml.

First we use the appropriate modules. XML::Generator::DBI needs a SAX handler to create the file, a handler such as XML::Handler::YAWriter:

use strict; use DBI; use XML::Generator::DBI; use XML::Handler::YAWriter;

Next we connect to the database, you will need to replace vendor, database, host, user, and pass; with your credentials:

my $dbh = DBI->connect( qw(DBI:vendor:database:host user pass), {RaiseError=>1}, );

Next we instantiate the SAX handler with the AsFile argument set the name of the file we wish to create:

my $handler = XML::Handler::YAWriter->new(AsFile => 'foo.xml');

Next we instantiate the generator. We pass our database handle and our SAX handler to the constructor, and specify that we want indentation turned on via the Indent argument (if you know that humans will not need to see the XML, then save some bytes and don't indent):

my $generator = XML::Generator::DBI->new( Handler => $handler, dbh => $dbh, Indent => 1, );

Finally, we execute our SQL query with the generator's execute() method:

$generator->execute('select bar,baz from foo');

And that's it. Providing that we properly connected to the database, had no errors in our SQL statement, and results were actually returned from the query, the file foo.xml will be created and it will contain the query results wrapped in XML elements. Here is what my results looked like:

<?xml version="1.0" encoding="UTF-8"?><database>
 <select query="select bar,baz from foo">
  <row>
   <bar>one</bar>
   <baz>1</baz>
  </row>
  <row>
   <bar>two</bar>
   <baz>2</baz>
  </row>
  <row>
   <bar>three</bar>
   <baz>3</baz>
  </row>
 </select>
</database>

There is a lot of magic going on behind the scenes. Here is an illustration of the pipeline through which the data flows and is transformed:

  XML::Generator::DBI
   -> XML::Handler::YAWriter

XML::Generator::DBI fetches the query results from the database and generates SAX events. These SAX events are handled by XML::Handler::YAWriter, which writes them out to the appropriate destination as XML elements.


Example 2: Working with XPath

[download example 2]

A powerful feature of DBI::Generator::XML is it's ability to work with any SAX Level 1 handler. Instead of writing to file, you can instead pass the XML to a handler such as XML::XPath's XML::XPath::Builder for further processing. Let's give it a try, first we import the necessary modules:

use strict; use DBI; use XML::Generator::DBI; use XML::XPath; use XML::XPath::Builder;

Next, we connect to the database and fetch a handle:

my $dbh = DBI->connect( qw(DBI:vendor:database:host user pass), {RaiseError=>1}, );

Next, we instantiate our SAX handler:

my $handler = XML::XPath::Builder->new();

Next, we instatiate the generator and pass it the handler (and notice that Indenting is not turned on - XML::XPath::Builder does not need it):

my $generator = XML::Generator::DBI->new( Handler => $handler, dbh => $dbh, );

Next, we call the execute() method from the generator - just like Example 1, but this time we actually receive an l-value from the generator() method, an XML::XPath::Node::Element object:

my $xp = $generator->execute('select bar,baz from foo');

Finally, we use this object to obtain a list of the nodes we want (consult the docs for XML::XPath and XML::XPath::Builder for more info on those modules):

my $nodeset = $xp->find('/database/select/row/bar'); print $_->string_value, "\n" for $nodeset->get_nodelist;

This will yield the following output (to STDOUT):

one
two
three

Our pipeline for this example looks like:

  XML::Generator::DBI
   -> XML::XPath::Builder

Now, of course we could have achieved roughly the same results with a simple database query, but the point of this example is to show that XML::Generator::DBI will work with any SAX1 handler. It even works with SAX2 handlers via XML::Filter::SAX1toSAX2, as we shall see in the next example.


Example 3: DBI to XML to HTML

[download example 3]

Let's try a SAX2 handler such as XML::Handler::HTMLWriter, which transforms SAX events into HTML elements. In order for XML::Generator::DBI's SAX1 events to be processed by XML::Handler::HTMLWriter, they will have to first be converted to SAX2 events. This is accomplished with XML::Filter::SAX1toSAX2. Let's take a look at the pipeline first:

  XML::Generator::DBI
   -> XML::Filter::SAX1toSAX2
     -> XML::Handler::HTMLWriter

In a moment, you will see that we instantiate the XML::Handler::HTMLWriter object, attach it to an XML::Filter::SAX1toSAX2 object, and pass the filter to XML::Generator::DBI. This chain of events might give the incorrect impression that XML::Handler::HTMLWriter is SAX1 and XML::Generator::DBI is SAX2 (it had me confused at first). Visualize the flow of events like the pipeline illustrates, not in the order that the objects are instantiated.

First, we import the necessary modules, and connect to the database:

use strict; use DBI; use XML::Generator::DBI; use XML::Filter::SAX1toSAX2; use XML::Handler::HTMLWriter; my $dbh = DBI->connect( qw(DBI:vendor:database:host user pass), {RaiseError=>1}, );

Next, we instatiate an XML::Handler::HTMLWriter handler, which outputs to STDOUT by default:

my $handler = XML::Handler::HTMLWriter->new();

Next, we will need to filter this handler:

my $filter = XML::Filter::SAX1toSAX2->new(Handler => $handler);

Next, a trick - XML::Handler::HTMLWriter expects the first element it encounters to be <html>, but XML::Generator::DBI's first element is <database>. No problem - we can override the name of the first element with the RootElement argument. The next pitfall is that instead of our tabular XML data being wrapped in <table> elements, they are wrapped in <select> elements - we can override this with the QueryElement argument. Yet another pitfall is that rows are wrapped in <row> elements, not <tr> elements - this is remedied by overriing the RowElement argument. And finally, the last pitfall is that the elements that make up each column name are the column names, not <td>. XML::Generator::DBI does have a ColumnElement, but we can't touch it. We can, however, change the name of each column name within SQL, via the AS keyword. Yes, this really works:

my $generator = XML::Generator::DBI->new( Handler => $filter, dbh => $dbh, RootElement => 'html', QueryElement => 'table', RowElement => 'tr', Indent => 1, ); $generator->execute('select foo as td,bar as td from baz');

And the results are:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
        "http://www.w3.org/TR/html4/strict.dtd">
<html>
 <table query="select foo as td,bar as td from baz">
  <tr>
   <td>foo</td>
   <td>1</td>
  </tr>
  <tr>
   <td>bar</td>
   <td>2</td>
  </tr>
  <tr>
   <td>baz</td>
   <td>3</td>
  </tr>
 </table>
</html>

While this works, it is no doubt not the correct procedure - there is no 'query' attribute for a <table> element. Also, it would be nice to show the names of each column as table headings (<th> tags). There has to be a better way ...


Example 4: DBI to XML to HTML (revisited)

[download example 4, download foo.xsl]

This time, we utilize the power of XSLT to transform an XML document to an HTML document. Discussing XSLT is beyond the scope of this review, but in a nutshell, XSL is a language for expressing style sheets (XSLT stands for XSL Transformations), and just so happens that it does a good job of generating HTML from XML. You can read more about XSL and XSLT at http://www.w3.org/Style/XSL.

The SAX Level 2 XML::Filter::XSLT module can be used with XML::SAX::Writer to apply a stylesheet to the XML content. Our pipeline looks like this:

  XML::Generator::DBI
   -> XML::Filter::SAX1toSAX2
     -> XML::Filter::XSLT
       -> XML::SAX::Writer

Here we go. First, import the modules and connect to the database:

use strict; use DBI; use XML::Generator::DBI; use XML::Filter::SAX1toSAX2; use XML::Filter::XSLT; use XML::SAX::Writer; my $dbh = DBI->connect( qw(DBI:vendor:database:host user pass), {RaiseError=>1}, );

Next, create the pipeline:

my $writer = XML::SAX::Writer->new(); my $xsl_filt = XML::Filter::XSLT->new(Handler => $writer); my $sax_filt = XML::Filter::SAX1toSAX2->new(Handler => $xsl_filt); my $generator = XML::Generator::DBI->new( Handler => $sax_filt, dbh => $dbh, );

Because XML::SAX::Writer is at the end of the chain, it will override XML::Generator::DBI's indenting, so there is no need to turn it on.

Finally, supply the stylesheet (given below) to the XML::Filter::XSLT object and generate the output:

$xsl_filt->set_stylesheet_uri('foo.xsl'); $generator->execute('select bar,baz from foo');

XML::SAX::Writer currently does not indent it's output, so i piped it through a Perl one-liner:

./example4.pl | perl -pe "s/></>\n</g"
<?xml version='1.0'?>
<html>
<body>
<table>
<tr>
<th>Bar</th>
<th>Baz</th>
</tr>
<tr>
<td>one</td>
<td>1</td>
</tr>
<tr>
<td>two</td>
<td>2</td>
</tr>
<tr>
<td>three</td>
<td>3</td>
</tr>
</table>
</body>
</html>

And finally, here is the XSL stylesheet - save this as the file foo.xsl:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<html>
<body>
<table>
  <tr>
    <th>Bar</th>
    <th>Baz</th>
  </tr>
  <xsl:for-each select="database/select/row">
    <tr>
      <td><xsl:value-of select="bar"/></td>
      <td><xsl:value-of select="baz"/></td>
    </tr>
  </xsl:for-each>
</table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>

Example 5: A Dynamic CGI Script

[download example 5, download dynamic.xsl]

Example 4 presented XSLT, but did so in fairly inflexible manner. The stylesheet used 'hard-coded' values: not only did we hard code the names of each row, we also hard coded the table headings. What if we don't know ahead of time what the names of the database columns will be? Is there a way abstract this information? You bet - it's called XPath, which was used in the stylesheet from example 4. But this time we use the power of XPath's * wildcard and text() function to accept any database column name. In order to provide dynamic column headers, we will utilize the XML::Generator::DBI constructor's ShowColumns argument.

Let's first take a peak at what the XML looks like for our sample database when we specify the ShowColumns argument:

<?xml version="1.0" encoding="UTF-8"?><database>
 <select query="select bar,baz from foo">
  <columns>
   <column>
    <name>bar</name>
    <type>varchar</type>
    <size>255</size>
    <precision>32</precision>
    <scale>0</scale>
    <nullable>NULL</nullable>
   </column>
   <column>
    <name>baz</name>
    <type>integer</type>
    <size>10</size>
    <precision>8</precision>
    <scale>0</scale>
    <nullable>NULL</nullable>
   </column>
  </columns>
  <row>
   <bar>one</bar>
   <baz>1</baz>
  </row>
  .....  etc.

With this information we can dynamically build our column headings for our table. And here is the new stylesheet, dynamic.xsl. It very similar to the last one, but will work with any SQL select query:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<html>
<body>
<table>
  <tr>

This time, instead of hard coding the field names we loop through the <name> elements provided by XML::Generator::DBI. We obtain the contents of those elements with the text() function:

  <xsl:for-each select="database/select/columns/column/name">
    <th><xsl:value-of select="text()"/></th>
  </xsl:for-each>
  </tr>
  <xsl:for-each select="database/select/row">
    <tr>

Here, instead of relying upon hard-coded field names, we utiltize the * wildcard to match any element, and obtain it's content again with text() function:

        <xsl:for-each select="*">
      <td><xsl:value-of select="text()"/></td>
        </xsl:for-each>
    </tr>
  </xsl:for-each>
</table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>

For our CGI script we will offer the end user a checkbox group of columns from a database table we designate. We could let them chose the table and the columns, but this way is more secure. We will still have to make sure that column names the end user supplies provides are valid, as a web bot does not need to use our interface to interact with the script. I chose to use my MP3 collection database, feel free to use whatever database you desire. First we import the necessary modules:

use strict; use DBI; use CGI qw(:standard); use CGI::Carp qw(fatalsToBrowser); use XML::SAX::Writer; use XML::Generator::DBI; use XML::Filter::SAX1toSAX2; use XML::Filter::XSLT;

Next, we specify the name of our stylesheet and the columns we will allow the end user to select:

my $style = 'dynamic.xsl'; my %field = ( name => 'artist.name as Artist', album => 'album.title as Album', title => 'song.title as Song', year => 'album.year as Year', );

Next, we print the header and our form:

print header, start_html, start_form, checkbox_group( -name => 'fields', -values => [ keys %field ], ), submit('go'), end_form, ;

Next, we check to see if the user has submitted the form:

if (param('go')) {

The next line only accepts those column names that we are allowing the user to select. If the user doesn't select any or provides invalid column names, then the array @ok will be empty:

my @ok = map { $field{$_} } grep $field{$_}, param('fields');

If @ok is empty, we issue a die, which will be intercepted by CGI::Carp. Otherwise we continue by joining the valid column names with commas:

die 'no valid fields selected' unless @ok; my $select = join(',', @ok);

Connect to the database. This time i specify the database mp3. Your milleage will vary:

my $dbh = DBI->connect( qw(DBI:vendor:mp3:host user pass), {RaiseError=>1}, );

Next, create the pipeline - notice that ShowColumns is turned on:

my $writer = XML::SAX::Writer->new(); my $xsl_filt = XML::Filter::XSLT->new(Handler => $writer); my $sax_filt = XML::Filter::SAX1toSAX2->new(Handler => $xsl_filt); my $generator = XML::Generator::DBI->new( Handler => $sax_filt, dbh => $dbh, ShowColumns => 1, );

Next, check that the stylesheet exists and is readable by the web server and die if it is not. If all is well, process the stylesheet:

die "could not open file $style" unless -r $style; $xsl_filt->set_stylesheet_uri($style);

Finally, execute our SQL statement and send the results to the browser:

$generator->execute(" select $select from song inner join album on song.album_id=album.id inner join artist on album.artist_id=artist.id order by artist.name,album.year,album.title "); }

Credits

Big thanks to author of XML::Generator::DBI, Matt Sergeant (Matts). Without his help, guidance, corrections and patches this tutorial could not have been written. Thanks once again Matt. :)


Bugs

While working with example 5, i discovered that CGI::Carp did not play well with XML::SAX::Writer. I found that i had older versions of both modules, so i installed the latest versions for both. This fixed the CGI::Carp issue, but it also broke example 3 - XML::Handler::HTMLWriter inherits from XML::SAX::Writer and tries to call XML::SAX::Writer's start_document() and end_document() methods, which no longer exist in v0.41. These methods do exist in v0.39, however, so i installed XML::SAX::Writer v0.39 and both examples 3 and 5 worked for me. This leads me to suspect that you will need to install the latest CGI::Carp in order for example 5 to work with it.

Also, example 5 has a couple of known bugs: an XML header is printed when one is not needed. Also, opening and close <html> and <body> tags are generated when they too are not needed. Either i am missing an API call or two, or the funtionality to omit these has not yet been implemented. While most (if not all) browsers will accept this, it surely is not valid XHTML. Also, i do not know of a way to attach a Cascading Style Sheet to the final example - this would be nice. Please /msg jeffa or post a reply if you have any information about these issues.

(Hopefully, this entire 'Bugs' secion will disappear soon.)


TODO

Do you have a favorite SAX handler that you would like to include? Then this is your chance to add to this tutorial. Feel free to post additional examples with XML::Generator::DBI.

Replies are listed 'Best First'.
Re: XML::Generator::DBI Tutorial
by chanio (Priest) on Apr 18, 2006 at 00:21 UTC
    Thank you for this 'XML in perl show'. It is even more useful as a working example of integrating xml-sax in perl than as it's main purpose in this tutorial.

    In order to preserve coherence with your first DBI example: at your third example you should correct the following:

    $generator->execute('select foo as td,bar as td from baz');
    Should be...
    $generator->execute('select bar as td,baz as td from foo');
    ...and the output should read:
    <table query="select bar as td,bas as td from foo"> <tr> <td>one</td> <td>1</td> </tr> <tr> <td>two</td> <td>2</td> </tr> <tr> <td>three</td> <td>3</td> </tr> </table>
    intead of...
    <table query="select foo as td,bar as td from baz"> <tr> <td>foo</td> <td>1</td> </tr> <tr> <td>bar</td> <td>2</td> </tr> <tr> <td>baz</td> <td>3</td> </tr> </table>
    Am I wrong?

    And, finally, if I understand your worries...

    ... Also, example 5 has a couple of known bugs: ... Also, opening and close <html> and <body> tags are generated when they too are not needed...
    Should happen because your previous dynamic.xsl stylesheet has the <html> and <body> tags mentioned...

    I hope that I have really :) contributed with this extraordinary tutorial. Thank you a lot!

      Simple script to convert CSV file into XML file http://exceptiongeek.com/articles/article/6/Convert-CSV-file-into-XML-file-in-Perl