perlquestion
hacker
<p align="justify">I maintain a FAQ for one of my projects. It started life as a static text file; to keep it updated and maintainable, I've converted it to SGML, and it now lives in our CVS repository.
<p align="justify">I'd like to fetch [http://cvs.plkr.org/index.cgi/*checkout*/docs/FAQ.sgml?content-type=text/plain|the FAQ] with [http://www.perldoc.com/perl5.6.1/lib/LWP.html|LWP] through either of the [http://chora.plkr.org/|two] [http://viewcvs.sourceforge.net|ViewCVS] [http://cvs.plkr.org/|interfaces] (which is how it was done when the FAQ was [http://cvs.plkr.org/index.cgi/*checkout*//FAQ?content-type=text/plain|static text]), convert it to [http://www.w3.org/TR/xhtml1/|XHTML], and display it on the web page dynamically.
<p align="justify">Each time a user clicks the 'FAQ' link on the website, the FAQ would be fetched from CVS, converted, wrapped in valid XHTML, and sent to their browser.
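<p align="justify">For illustration, a minimal sketch of that fetch-and-wrap flow might look like the following. The helper names fetch_faq and wrap_xhtml, the UTF-8 encoding, the Strict doctype, and the page title are all assumptions made for the example; the SGML-to-XHTML conversion itself is left as a placeholder, since that is the open question.

```perl
use strict;
use warnings;

# Hypothetical helper: fetch the raw SGML through a ViewCVS checkout URL.
# LWP::Simple is loaded lazily so the rest of the sketch runs without it.
sub fetch_faq {
    my ($url) = @_;
    require LWP::Simple;
    my $sgml = LWP::Simple::get($url);    # returns undef on failure
    die "Could not fetch $url\n" unless defined $sgml;
    return $sgml;
}

# Hypothetical helper: wrap an already-converted body in an XHTML 1.0 page.
# The encoding and the Strict doctype are assumptions.
sub wrap_xhtml {
    my ($title, $body) = @_;
    return <<"END_XHTML";
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head><title>$title</title></head>
<body>
$body
</body>
</html>
END_XHTML
}

# 'sub faq' would then be roughly:
#   my $sgml = fetch_faq($faq_url);
#   my $body = convert($sgml);    # the SGML-to-XHTML step, still undecided
#   print "Content-type: text/html\n\n", wrap_xhtml('Plucker FAQ', $body);
```

<p align="justify">Fetching on every click keeps the page in step with CVS, at the cost of one HTTP round trip per request; caching the converted output would be a separate decision.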
<p>The FAQ generally looks [http://cvs.plkr.org/index.cgi/*checkout*/docs/FAQ.sgml?content-type=text/plain|like this]:<blockquote><code><!-- ####################### -->
<!-- Section 2: Installation -->
<!-- ####################### -->
<sect1 id="whatplatforms">
<title>What platforms does Plucker run on?</title>
<para>
The viewer should run on any Palm OS device utilizing
version 2.0.4 or higher of Palm OS, while the desktop
tools are supported on Linux, Windows, Mac OS X, and
OS/2.
</para>
<para>
The desktop tools will probably work on any Unix system
with Python installed, but your mileage may vary, so
don't get angry if they don't work. If you are able to
get it running on a system not listed in REQUIREMENTS
then please let us know so that it can be added to the
list of supported systems.
</para>
</sect1>
<!-- #################################### -->
<!-- Section 2: Installation: END -->
<!-- #################################### -->
</code></blockquote>
<p align="justify">Some sections also contain <code><itemizedlist></code> and <code><listitem></code> tags, so those must be handled as well. I'd rather stay away from [http://www.perldoc.com/perl5.6.1/lib/HTML/Template.html|HTML::Template] for this particular venture, though I will be moving in that direction soon. Right now, the code lives in 'sub faq {...}'.
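<p align="justify">As a rough sketch of the hand-rolled route, the handful of tags shown above could be renamed to XHTML equivalents with a lookup table and a single substitution. The mapping and the function name docbook_to_xhtml are assumptions for illustration, not an existing module. It only handles bare tags or tags carrying a lone id attribute, passes anything else through untouched, and does no validation, so it is a starting point rather than a real SGML parser.

```perl
use strict;
use warnings;

# Assumed mapping from the DocBook tags seen in the FAQ to XHTML tags.
my %tag_for = (
    sect1        => 'div',
    title        => 'h2',
    para         => 'p',
    itemizedlist => 'ul',
    listitem     => 'li',
);

# Hypothetical converter: rename known tags, keep an id attribute if present.
sub docbook_to_xhtml {
    my ($sgml) = @_;

    # Drop the SGML comment banners between sections.
    $sgml =~ s/<!--.*?-->\n?//gs;

    # Match opening and closing tags; $1 is the optional slash, $2 the tag
    # name, $3 an optional id attribute. Unknown tag names are left as they are.
    $sgml =~ s{<(/?)(\w+)((?:\s+id="[^"]*")?)\s*>}{
        my $name = exists $tag_for{lc $2} ? $tag_for{lc $2} : $2;
        "<$1$name$3>";
    }ge;

    return $sgml;
}

# Usage: feed it the fetched SGML and print the result inside the page body.
#   print docbook_to_xhtml($sgml);
```

<p align="justify">Keeping the sect1 id on the resulting div means each question can still be linked to directly with a fragment such as #whatplatforms.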
<p align="justify">Has anyone done this? Are there any tricks to it? I've read that converting the SGML to XML first is one way to go. I've looked at [SGML::Parser] and [http://www.perldoc.com/perl5.6.1/lib/HTML/Parser.html|HTML::Parser], but I'm not sure they'd help without a lot of hand-rolled code to walk the FAQ and pull the elements into hashes myself.
<p align="justify">One other idea I had was to parse the SGML into a [http://www.mysql.com/|MySQL] database and keep the text of the FAQ there. The only problem is that the file itself would no longer be in CVS, available for checkout, though I could add a pre-checkout command that runs the query and writes the data from MySQL out to the SGML file (slowly getting off-topic here).
<p>What's the best approach here?