What am I talking about?

I tried to use existing command-line tools to add POD documentation to a technical web site. pod2html, a command-line tool, seemed to be the right choice: no programming, no trouble.

That worked fine until I wanted to integrate these pages into the look-and-feel of the rest of the site. That is when I discovered that 1) the output of pod2html is XHTML, not HTML, and 2) some elements, such as <dt>, are not properly closed in exceptional circumstances.

What do I want to do?

Certainly not full processing of the file.

Basically: remove the XML declaration <?xml ... ?>, replace the XHTML DOCTYPE with the HTML DOCTYPE, retrieve the important information from the <head> block and adapt it to the site rules, add my standard header at the beginning of the <body> block and my standard footer before the </body> tag.

As can be seen, this does not require a full XML parser.
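To make the point concrete, here is a minimal sketch of those four steps using plain regular expressions. It is not my actual adapt-to-site-look script: the sub name adapt_page, the header and footer fragments, and the choice of the HTML 4.01 Strict DOCTYPE are all assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical site fragments; real ones would come from the site templates.
my $SITE_HEADER = qq{<div id="site-header">My site</div>\n};
my $SITE_FOOTER = qq{<div id="site-footer">Footer</div>\n};

# Adapt one pod2html page (passed as a single string) to the site layout.
# Returns the adapted page and the title found in <head>.
sub adapt_page {
    my ($html) = @_;

    # 1. Drop the XML declaration, if there is one.
    $html =~ s/\A\s*<\?xml\b.*?\?>\s*//s;

    # 2. Swap the XHTML DOCTYPE for an HTML one (4.01 Strict is an assumption).
    $html =~ s{<!DOCTYPE\s+html\b[^>]*>}
              {<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">}i;

    # 3. Pull the title out of <head> so the site rules can reuse it.
    my ($title) = $html =~ m{<title>(.*?)</title>}is;

    # 4. Header right after the <body ...> tag, footer right before </body>.
    $html =~ s{(<body\b[^>]*>)}{$1\n$SITE_HEADER}i;
    $html =~ s{(</body>)}{$SITE_FOOTER$1}i;

    return ($html, $title);
}

# Small demonstration on an inline sample page.
my $sample = <<'END';
<?xml version="1.0" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>My::Module</title></head>
<body>
<h1>NAME</h1>
</body>
</html>
END

my ($page, $title) = adapt_page($sample);
print $page;
```

A real filter would read the page from STDIN instead of the inline sample. Note that the sketch leaves the xmlns attribute and any self-closing tags untouched; whether that matters depends on how strict the target HTML has to be.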

Workaround as of today

I have written a very small Perl script that reads a faulty XHTML file and looks for <dt> tags. If a <dd> tag is seen without a preceding </dt> tag, the missing tag is inserted right before the <dd> tag.

It just needs a simple state automaton.
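For illustration, a two-state automaton of this kind could look like the sketch below. It is a reconstruction of the idea, not the checkdt script itself; the sub name close_dt and the sample input are made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return a copy of $html in which every <dt> that is still open
# gets its </dt> inserted right before the next <dd>.
sub close_dt {
    my ($html) = @_;
    my $in_dt = 0;     # state: 1 while inside a <dt> that has not been closed
    my $out   = '';

    # Split into <dt>/<dd> tags and the text between them; the capturing
    # group makes split() return the tags as well as the text.
    for my $piece (split m{(</?d[dt]\b[^>]*>)}i, $html) {
        if    ($piece =~ m{\A<dt\b}i)  { $in_dt = 1 }
        elsif ($piece =~ m{\A</dt\b}i) { $in_dt = 0 }
        elsif ($piece =~ m{\A<dd\b}i && $in_dt) {
            $out .= "</dt>\n";         # insert the missing closing tag
            $in_dt = 0;
        }
        $out .= $piece;
    }
    return $out;
}

# Small demonstration on a made-up faulty fragment.
my $broken = "<dl>\n<dt><strong>item</strong>\n<dd>text</dd>\n</dl>\n";
print close_dt($broken);
```

The sketch only handles the case described above, a <dd> arriving while a <dt> is still open. Other malformations, such as two consecutive unclosed <dt> tags, are passed through unchanged.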

Now, my transformation becomes:

pod2html perl-file-with-pod.pm | checkdt | adapt-to-site-look -o outputfile.html

Suggestions

I'll have a look at XML::Twig and XML::LibXML if the basic features of XML::Parser lead to overly complex code.

Bug fix

Fixing a bug is always a good thing. Since this one has been brought to light, it should be fixed, all the more so if the fix is easy.

Thanks to all for the information.


In reply to Re^2: POD translation to HTML bug? (pod2html) by ajl52
in thread POD translation to HTML bug? by ajl52
