Re^2: POD translation to HTML bug? (pod2html)

What am I talking about?

I tried to use existing command-line tools to add POD information to a technical web site. pod2html, a command-line tool, seemed to be the right choice: no programming, no trouble.

That worked fine until I wanted to integrate these pages into the look-and-feel of the rest of the site. This is when I discovered that 1) the output of pod2html is XHTML, not HTML, and 2) some elements, such as <dt>, are not properly closed in exceptional circumstances.

What do I want to do?

Certainly not full processing of the file.

Basically: remove the XML declaration <?xml ... ?>, replace the XHTML DOCTYPE with the HTML DOCTYPE, retrieve important information from the <head> block to adapt it to the site rules, add my standard header at the beginning of the <body> block, and add my standard footer before the </body> tag.

As can be seen, this does not require a full XML parser.
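
The steps listed above can be sketched with plain text substitutions. This is only an illustration in Python, not the actual adapt-to-site-look tool (which is not shown in this thread). The header and footer fragments, the function names, and the choice of the HTML 4.01 Strict DOCTYPE are all assumptions made for the example.

```python
import re

# Hypothetical site fragments; the real header and footer are not given in the post.
SITE_HEADER = '<div id="site-header">Site header</div>'
SITE_FOOTER = '<div id="site-footer">Site footer</div>'

# Assumption: the target is HTML 4.01 Strict; the post only says "the HTML DOCTYPE".
HTML_DOCTYPE = ('<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
                '"http://www.w3.org/TR/html4/strict.dtd">')


def adapt_to_site_look(page, header=SITE_HEADER, footer=SITE_FOOTER):
    """Apply the four textual changes described in the post to one page."""
    # 1. Drop the leading XML declaration, if present.
    page = re.sub(r'^\s*<\?xml[^>]*\?>\s*', '', page, count=1)
    # 2. Replace the first DOCTYPE (it may span several lines) with the HTML one.
    page = re.sub(r'<!DOCTYPE[^>]*>', lambda m: HTML_DOCTYPE, page, count=1)
    # 3. Insert the header right after the opening <body ...> tag.
    page = re.sub(r'<body\b[^>]*>',
                  lambda m: m.group(0) + '\n' + header, page, count=1)
    # 4. Insert the footer right before the closing </body> tag.
    page = re.sub(r'</body>',
                  lambda m: footer + '\n' + m.group(0), page, count=1)
    return page


def page_title(page):
    """Return the text of <title>, one example of information taken from <head>."""
    match = re.search(r'<title>(.*?)</title>', page, flags=re.S)
    return match.group(1).strip() if match else None
```

Regular expressions are enough here only because the edits touch a handful of well-known, non-nested markers; anything that depends on element nesting would call for a real parser.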

Workaround as of today

I have written a very small Perl script that reads a faulty XHTML file and looks for <dt> tags. If a <dd> tag is seen without a preceding </dt> tag, the missing tag is inserted right before the <dd> tag.

It just needs a simple state automaton.
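
To make the idea concrete, here is a minimal sketch of such a two-state automaton. It is written in Python for illustration and is not the Perl checkdt script, which is not shown in this thread. The function name and the regular expression are assumptions. It handles only the case described above: a <dd> arriving while a <dt> is still open.

```python
import re

# Matches opening and closing <dt>/<dd> tags; \b keeps <dl> and <dtfoo> out.
DT_DD_TAG = re.compile(r'(</?(?:dt|dd)\b[^>]*>)', re.IGNORECASE)


def close_open_dt(html):
    """Insert a missing </dt> right before any <dd> seen while a <dt> is open."""
    out = []
    dt_open = False  # the automaton's single bit of state
    # With a capturing group, split() alternates: text, tag, text, tag, ...
    for index, piece in enumerate(DT_DD_TAG.split(html)):
        if index % 2 == 1:  # odd positions are the matched tags
            tag = piece.lower()
            if tag.startswith('</dt'):
                dt_open = False
            elif tag.startswith('<dt'):
                dt_open = True
            elif tag.startswith('<dd') and dt_open:
                out.append('</dt>\n')  # the repair itself
                dt_open = False
        out.append(piece)
    return ''.join(out)
```

Used as a filter, the function would read standard input and write the result to standard output, for instance with `sys.stdout.write(close_open_dt(sys.stdin.read()))`. Running it on already correct input leaves the text unchanged.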

Now, my transformation becomes:

pod2html perl-file-with-pod.pm | checkdt | adapt-to-site-look -o outputfile.html

Suggestions

I'll have a look at XML::Twig and XML::LibXML if the basic features of XML::Parser lead to overly complex code.

Bug fix

Fixing a bug is always a good thing. Since this one has been exposed to the light, it should be fixed, all the more so if it is easy.

Thanks to all for the information.

Re^3: POD translation to HTML bug? (pod2html) by Anonymous Monk on Dec 06, 2013 at 11:35 UTC
"What am I talking about? I tried to ..."

repeat word description

Prove it, with example pod, example html output, and version information.

I couldn't reproduce your claims with my pod2html (the version I mentioned): the dt was properly closed, it worked, contrary to what you reported ... pretend as if you're making a real bug report :)

"What do I want to do? ...."

I don't remember asking those :D
Re^4: POD translation to HTML bug? (pod2html) by ajl52 (Novice) on Dec 07, 2013 at 14:19 UTC
No offence was intended. It was a kind of conclusion/closing of the thread.

However, you puzzled me with the non-reproduced behaviour. Consequently, I wrote a small sample case:

    =pod
    =over
    =item
    Sample paragraph
    =item Non-empty
    Second sample
    =back
    =cut
    1;

Empty lines suppressed to make it smaller.

I then ran `pod2html --infile=podbug.pm --outfile=podbug.html` and the result is:

    <?xml version="1.0" ?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
    <head>
    <title>podbug.pm</title>
    <meta http-equiv="content-type" content="text/html; charset=utf-8" />
    <link rev="made" href="mailto:root@localhost" />
    </head>
    <body style="background-color: white">
    <!-- INDEX BEGIN -->
    <div name="index">
    <p><a name="__index__"></a></p>
    </div>
    <!-- INDEX END -->
    <dl>
    <dt>
    <dd>
    <p>Sample paragraph</p>
    </dd>
    <dt><strong><a name="non_empty" class="item">Non-empty</a></strong></dt>
    <dd>
    <p>Second sample</p>
    </dd>
    </dl>
    </body>
    </html>

From this output:

1) It is clearly XHTML, not HTML, from the declaration in line 1 and the DOCTYPE in line 2.
2) An empty `=item` is translated into a single <dt> without a closing </dt>, which is legal in HTML but not in XHTML.
3) A non-empty `=item` is correctly translated.

Version information

I don't know which package pod2html belongs to. My present Perl installation is v5.14.4 (it should be upgraded within a month). I found Pod::Simple::HTML in the library and it claims version 3.16. I'm no Perl guru, but from a quick look at this package, I doubt it is used by pod2html because the DOCTYPE in it is for HTML 4.01 Transitional.

Regards

PS: If you need more version information, tell me how to find it.

PPS: It would be nicer to attach the test files but I don't know how to do it.
Re^5: POD translation to HTML bug? (pod2html Pod::Html 1.11) by Anonymous Monk on Dec 07, 2013 at 23:19 UTC
"No offence was intended."

Great, none was taken :)

"... =pod ..."

Well, that isn't valid pod, just like I mentioned earlier:

    $ podchecker junk.pod
    *** ERROR: =pod directives shouldn't be over one line long! Ignoring all 7 lines of content at line 1 in file junk.pod
    junk.pod does not contain any pod commands.

To make it valid pod write

Read more... (295 Bytes)

So you'll get (after running through xml_pp)

Read more... (1164 Bytes)

So it's xhtml, it's all balanced and properly nested ... sure, the id is invalid, but that's no big deal :)

    $ perldoc pod2html | ack ::
    See Pod::Html for a list of known bugs in the translator.
    perlpod, Pod::Html
    $ mversion Pod::Html
    1.1502

Read more... (636 Bytes)

So Pod::Html v1.11 is about 3 years and 60 commits ago, before it used Pod::Simple.

So the pod2html you're complaining about is really old and unsupported, from a version of perl that is not supported anymore :)