ajl52 has asked for the wisdom of the Perl Monks concerning the following question:

My code contains POD to provide first-level documentation (higher-level documentation lives in technical documents maintained in a text processor).

For instance, I use =over, =item 1 ..., =back to describe arguments to functions.

Sometimes, I insert notes or remarks about what could be otherwise difficult to understand at first sight. My structure is:

Notes:
=over
=item
Text of the note
=back

Please note that the =item stuff is empty because, in this case, I don't like the default bold rendering of the "title" item line.

This POD is translated into HTML through pod2html. Browsers display it as intended.

Now I want these HTML pages to look like the rest of the web site, i.e. I want to add headers, fancy titles and footers, plus various links. This should be an easy job for XML::Parser, but it complains about mismatched tags.

I have traced it back to my empty =item line, which is translated as a single <dt> without a closing </dt> tag.

Translation is correct when =item stuff is not empty.

Is my =item usage forbidden by POD rules?

Can this be corrected in pod2html?

In the meantime, how can I work around this shortcoming?

Thanks for the tips.

Re: POD translation to HTML bug?
by tobyink (Canon) on Dec 05, 2013 at 11:06 UTC

    Pod's "rules" are pretty loose. In general, you can take many liberties, and if things are rendered OK by pod2man and pod2html, you'll get away with it!

    But according to perlpod:

    "And perhaps most importantly, keep the items consistent: either use =item * for all of them, to produce bullets; or use =item 1., =item 2., etc., to produce numbered lists; or use =item foo, =item bar, etc.--namely, things that look nothing like bullets or numbers."

    I would suggest that you want bullets:

    =over

    =item * Foo

    =item * Bar

    =item * Baz

    =back

    PS: as per the HTML spec, the closing </dt> tag is optional. XML::Parser complains because it's an XML parser (the clue is in the name!), not an HTML parser.

    use Moops; class Cow :rw { has name => (default => 'Ermintrude') }; say Cow->new->name

      But pod2html is a misnomer because it outputs XHTML: its output file starts with the declaration

      <?xml version="1.0" ?>

      followed by a DOCTYPE that explicitly references XHTML:

      <!DOCTYPE html ...XHTML... >

      and empty elements are legitimately closed with />, whereas in HTML some closing tags may be implicit (like </p> or </li>).

      Consequently, one would expect full XHTML compliance from pod2html output.

      When an empty =item is detected, the translation could be <dt /> or, if "void-ness" is discovered later, <dt></dt>.

      The behaviour seems to imply that =item cannot be void.

      The suggested fix has the drawback (for me) of adding an extra line containing only the bullet, which increases the vertical space occupied by the notes. I could, of course, use an &nbsp; as the title, but that also ends up adding extra space.

      Maybe the best interim fix would be an extra pass over the HTML file to detect whether a <dt> is immediately followed by a <dd> and, if so, insert a </dt>.

      This can be done with a simple Perl script.

        It seems that it might be simple to add an auto-close "feature" to pod2html: not by default, but as an option. Given that <img> is already a self-contained tag, it might be easy.

        --Chris

        #!/usr/bin/perl -Tw
        use Perl::Always or die;
        my $perl_version = (5.12.5);
        print $perl_version;
Re: POD translation to HTML bug? (pod2html)
by Anonymous Monk on Dec 06, 2013 at 04:50 UTC

    Is my =item usage forbidden by POD rules?

    Doesn't really matter :) but it isn't; see perlpodspec and perlpod.

    Naturally, this assumes the invalid POD you posted is really valid POD in your real file.

    Can this be corrected in pod2html?

    Sure, but why bother? Don't hold your breath :) What are you talking about? Maybe you want to upgrade Pod::Html; I have an old 1.1502 and I don't get what you get.

    There are many alternatives to Pod::Html like Pod::Simple::... Pod::Simple::HTML, Pod::Simple::HTMLBatch, Pod::POM::View::HTML/Pod::POM::Web, Perl::Tidy

    # Fragment from inside a sub; $modfile is the path to the file containing the POD
    use File::Spec;
    use Perl::Tidy();
    my $html = "";
    Perl::Tidy::perltidy(
        source      => $modfile,
        destination => \$html,
        argv        => " -html ",
        stderr      => File::Spec->devnull,
    );
    return $html;

    In the meantime, how can I workaround this shortcoming?

    Well, I would stop using XML::Parser, as it's very low level. I would use XML::Twig or XML::LibXML.

      What am I talking about?

      I tried to use existing command-line tools to add POD info to a technical web site. Consequently, pod2html, a command-line tool, seemed to be the right choice. No programming, no trouble.

      That worked fine until I wanted to integrate these pages into the look and feel of the rest of the site. That is when I discovered that 1) the output of pod2html is XHTML, not HTML, and 2) some elements, namely <dt>, are not properly closed in exceptional circumstances.

      What do I want to do?

      Certainly not full file processing.

      Basically: remove the XML declaration <?xml ... ?>, replace the XHTML DOCTYPE with the HTML DOCTYPE, retrieve important information from the <head> block to adapt it to the site rules, add my standard header at the beginning of the <body> block, and add my standard footer before the </body> tag.

      As can be seen, this does not require a full XML parser.
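      The steps listed above can be sketched with a few regular expressions. This is only an illustration under stated assumptions, not the adapt-to-site-look script from this thread: the names adapt_to_site and page_title are hypothetical, the target DOCTYPE is assumed to be the short <!DOCTYPE html> form, and the header and footer are passed in as plain strings.

```perl
use strict;
use warnings;

# Hypothetical helper: apply the site look to one page held in a string.
sub adapt_to_site {
    my ($page, $header, $footer) = @_;
    $page =~ s{\A\s*<\?xml\b.*?\?>\s*}{}s;            # drop the XML declaration
    $page =~ s{<!DOCTYPE\b[^>]*>}{<!DOCTYPE html>}i;  # assumed target DOCTYPE
    $page =~ s{(<body\b[^>]*>)}{$1\n$header}i;        # header right after <body>
    $page =~ s{(</body\s*>)}{$footer\n$1}i;           # footer right before </body>
    return $page;
}

# Hypothetical helper: pull the <title> text out of the <head> block.
sub page_title {
    my ($page) = @_;
    my ($title) = $page =~ m{<title\b[^>]*>(.*?)</title>}is;
    return $title;
}

# Made-up sample page, not actual pod2html output.
my $sample = qq{<?xml version="1.0" ?>\n<html><head><title>Demo</title></head>}
           . qq{<body>\n<p>text</p>\n</body></html>\n};
print adapt_to_site($sample, '<div id="site-header"></div>',
                             '<div id="site-footer"></div>');
```

      Regular expressions are enough here only because each substitution targets a construct that occurs once per page; anything that has to look inside the document body is better served by a real parser.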

      Workaround as of today

      I have written a very small Perl script that reads a faulty XHTML file and looks for <dt> tags. If a <dd> tag is seen without a preceding </dt> tag, the missing tag is inserted right before the <dd> tag.

      Just needs a simple state automaton.
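      A minimal sketch of such a state automaton, assuming the whole page fits in memory. It is not the checkdt script mentioned in this thread: the name close_open_dt is hypothetical, and the sample input is made up rather than real pod2html output.

```perl
use strict;
use warnings;

# Hypothetical helper: insert a missing </dt> before the next <dd> or <dt>.
sub close_open_dt {
    my ($html) = @_;
    my $dt_open = 0;    # state: true while a <dt> is waiting for its </dt>
    my $out     = '';
    # Split on <dt>, </dt>, <dd>, </dd>; the capture keeps the tags as tokens.
    for my $tok (split m{(</?d[td]\b[^>]*>)}i, $html) {
        if ($tok =~ m{^<dt\b}i) {
            $out .= '</dt>' if $dt_open;    # previous <dt> was never closed
            $dt_open = $tok !~ m{/>$};      # <dt /> is already closed
        }
        elsif ($tok =~ m{^</dt\b}i) {
            $dt_open = 0;
        }
        elsif ($tok =~ m{^<dd\b}i) {
            $out .= '</dt>' if $dt_open;    # the missing tag goes right before <dd>
            $dt_open = 0;
        }
        $out .= $tok;
    }
    return $out;
}

# Made-up sample with an unclosed <dt>.
my $sample = "<dl>\n<dt>\n<dd>Text of the note</dd>\n</dl>\n";
print close_open_dt($sample);
```

      To use it as a filter in a pipeline, the demo lines at the end could be replaced by reading all of STDIN into a string and printing the result of close_open_dt on it.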

      Now, my transformation becomes:

      pod2html perl-file-with-pod.pm | checkdt | adapt-to-site-look -o outputfile.html

      Suggestions

      I'll have a look at XML::Twig and XML::LibXML if the basic features of XML::Parser lead to overly complex code.

      Bug fix

      Fixing a bug is always a good thing. Since this one has been exposed to the light, it should be fixed, all the more so if it is easy.

      Thanks to all for the information.

        What am I talking about? I tried to ... repeat word description

        Prove it, with example POD, example HTML output, and version information.

        I couldn't reproduce your claims with my pod2html (the version I mentioned), the dt was properly closed, it worked contrary to what you reported ... pretend as if you're making a real bug report :)

        What do I want to do? ....

        I don't remember asking those :D