in reply to Re: POD translation to HTML bug? (pod2html)
in thread POD translation to HTML bug?

What am I talking about?

I tried to use existing command-line tools to add POD information to a technical website. pod2html, being a command-line tool, seemed to be the right choice. No programming, no trouble.

That worked fine until I wanted to integrate these pages into the look-and-feel of the rest of the site. This is when I discovered that 1) the output of pod2html is XHTML, not HTML, and 2) some elements, such as <dt>, are not properly closed in exceptional circumstances.

What do I want to do?

Certainly not full processing of the file.

Basically: remove the XML declaration <?xml ... ?>, replace the XHTML DOCTYPE with the HTML DOCTYPE, retrieve important information from the <head> block to adapt it to the site rules, add my standard header at the beginning of the <body> block, and add my standard footer before the </body> tag.

As can be seen, this does not require a full XML parser.
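The steps listed above can be sketched with a few regular expressions. This is only a minimal illustration, not the actual adapt-to-site-look script: the sub name adapt_to_site, the header and footer fragments, and the choice of the HTML 4.01 Strict DOCTYPE are all assumptions, and extracting the title stands in for whatever information the site rules need from <head>.

```perl
use strict;
use warnings;

# Hypothetical site fragments; replace with the real header and footer.
my $SITE_HEADER = qq{<div id="site-header">My site</div>\n};
my $SITE_FOOTER = qq{<div id="site-footer">Footer</div>\n};

# Hypothetical helper: adapt one pod2html page to the site layout.
# Returns the modified page and the title found in <head> (undef if none).
sub adapt_to_site {
    my ($page) = @_;

    # 1. Drop the XML declaration, if present.
    $page =~ s{\A\s*<\?xml\b.*?\?>\s*}{}s;

    # 2. Replace the XHTML DOCTYPE with an HTML one
    #    (HTML 4.01 Strict is an assumption made for this sketch).
    $page =~ s{<!DOCTYPE\s+html\b[^>]*>}
              {<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">}i;

    # 3. Retrieve information from <head>; here only the title.
    my ($title) = $page =~ m{<title>(.*?)</title>}is;

    # 4. Header right after the opening <body ...> tag,
    #    footer right before the closing </body> tag.
    $page =~ s{(<body\b[^>]*>)}{$1\n$SITE_HEADER}i;
    $page =~ s{(</body\s*>)}{$SITE_FOOTER$1}i;

    return ($page, $title);
}

# As a filter, the whole input could be slurped and passed through:
#   my ($page, $title) = adapt_to_site(do { local $/; <STDIN> });
#   print $page;
```

Note that this leaves XHTML-style self-closing tags such as <meta ... /> untouched; whether that matters depends on how strictly the site validates its pages.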

Workaround as of today

I have written a very small Perl script reading a faulty XHTML file and looking for <dt> tags. If a <dd> tag is seen without a previous </dt> tag, the missing tag is inserted right before <dd> tag.

Just needs a simple state automaton.
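For illustration, such an automaton might look like the sketch below. It is not the actual checkdt script; the sub name close_open_dt is hypothetical, and the code assumes the only defect to repair is a <dt> left open when the next <dd> starts.

```perl
use strict;
use warnings;

# Hypothetical helper: insert a missing </dt> before any <dd> that
# follows an unclosed <dt>. Only <dt>, </dt> and <dd> tags are examined;
# everything else passes through unchanged.
sub close_open_dt {
    my ($html) = @_;
    my $dt_open = 0;    # automaton state: 1 while a <dt> awaits its </dt>

    $html =~ s{(</?dt\b[^>]*>|<dd\b[^>]*>)}{
        my $tag = $1;
        my $out = $tag;
        if    ($tag =~ m{^</dt}i) { $dt_open = 0 }    # proper close seen
        elsif ($tag =~ m{^<dt}i)  { $dt_open = 1 }    # new <dt> opened
        elsif ($dt_open) {                            # <dd> while <dt> open
            $out     = "</dt>\n$tag";
            $dt_open = 0;
        }
        $out;
    }gie;

    return $html;
}

# As a filter in the pipeline, the whole input could be slurped:
#   print close_open_dt(do { local $/; <STDIN> });
```

Because a correctly closed <dt> resets the state, running the filter on already valid output should leave it unchanged.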

Now, my transformation becomes:

pod2html perl-file-with-pod.pm | checkdt | adapt-to-site-look -o outputfile.html

Suggestions

I'll have a look at XML::Twig and XML::LibXML if the basic features of XML::Parser lead to overly complex code.

Bug fix

Fixing a bug is always a good thing. Since this one has been brought to light, it should be fixed, all the more so if the fix is easy.

Thanks to all for the information.

Re^3: POD translation to HTML bug? (pod2html)
by Anonymous Monk on Dec 06, 2013 at 11:35 UTC

    What am I talking about? I tried to ... repeat word description

    Prove it, with example pod, example html output, and version information

    I couldn't reproduce your claims with my pod2html (the version I mentioned), the dt was properly closed, it worked contrary to what you reported ... pretend as if you're making a real bug report :)

    What do I want to do? ....

    I don't remember asking those :D

      No offence was intended. It was kind of conclusion/closing of the thread.

      However, you puzzled me with the non-reproduced behaviour. Consequently, I wrote a small sample case:

      =pod
      =over
      =item
      Sample paragraph
      =item Non-empty
      Second sample
      =back
      =cut
      1;

      Empty lines suppressed to make it smaller.

      I then ran pod2html --infile=podbug.pm --outfile=podbug.html and the result was:

      <?xml version="1.0" ?>
      <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
      <html xmlns="http://www.w3.org/1999/xhtml">
      <head>
      <title>podbug.pm</title>
      <meta http-equiv="content-type" content="text/html; charset=utf-8" />
      <link rev="made" href="mailto:root@localhost" />
      </head>
      <body style="background-color: white">
      <!-- INDEX BEGIN -->
      <div name="index">
      <p><a name="__index__"></a></p>
      </div>
      <!-- INDEX END -->
      <dl>
      <dt>
      <dd>
      <p>Sample paragraph</p>
      </dd>
      <dt><strong><a name="non_empty" class="item">Non-empty</a></strong></dt>
      <dd>
      <p>Second sample</p>
      </dd>
      </dl>
      </body>
      </html>

      From this output:

      1. It is clearly XHTML, not HTML, judging from the declaration in line 1 and the DOCTYPE in line 2
      2. The empty =item is translated into a single <dt> without a closing </dt>, which is legal in HTML but not in XHTML
      3. The non-empty =item is correctly translated

      Version information

      I don't know which package pod2html belongs to. My present Perl installation is v5.14.4 (it should be upgraded within a month). I found Pod::Simple::HTML in the library and it claims version 3.16.

      I'm no Perl guru, but from a quick look at this package, I doubt it is used by pod2html, because the DOCTYPE in it is for HTML 4.01 Transitional.

      Regards

      PS: If you need more version information, tell me how to find it.

      PPS: It would be nicer to attach the test files but I don't know how to do it.

        No offence was intended.

        Great, none was taken :)

        ... =pod ...

        Well, that isn't valid pod, just like I mentioned earlier

        $ podchecker junk.pod
        *** ERROR: =pod directives shouldn't be over one line long! Ignoring all 7 lines of content at line 1 in file junk.pod
        junk.pod does not contain any pod commands.

        To make it valid pod write

        So you'll get (after running through xml_pp)

        so it's xhtml, it's all balanced and properly nested ... sure, the id is invalid, but that's no big deal :)

        $ perldoc pod2html |ack ::
            See Pod::Html for a list of known bugs in the translator.
            perlpod, Pod::Html
        
        $ mversion Pod::Html
        1.1502
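If the mversion tool is not available, the same version information can be read from Perl itself. The sketch below is one possible way to do it; the sub name module_info is hypothetical, and the modules queried are just the two mentioned in this thread.

```perl
use strict;
use warnings;

# Hypothetical helper: load a module and return its version together
# with the file it was loaded from. Dies if the module is not installed.
sub module_info {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # Pod::Html -> Pod/Html.pm
    require $file;
    return ($module->VERSION, $INC{$file});
}

for my $module (qw(Pod::Html Pod::Simple)) {
    my ($version, $path) = eval { module_info($module) };
    if ($@) {
        print "$module: not installed\n";
    }
    else {
        $version = 'unknown' unless defined $version;
        print "$module $version ($path)\n";
    }
}
```

The reported path also shows which Perl installation the module comes from, which helps when several perls are installed side by side.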

        So Pod::Html v1.11 is about 3 years and 60 commits ago, before it used Pod::Simple

        So the pod2html you're complaining about is really old and unsupported, and comes from a version of perl that is not supported anymore

        :)