QM has asked for the wisdom of the Perl Monks concerning the following question:

I tried to install PDF::FromHTML, but ran into problems (Win XP). I had to force the install, including Graphics::Colors. Then HTML::Tidy wouldn't force, and I resorted to XML::Clean.

Running the script html2pdf.pl on something simple generated parsing errors and no output.

Am I on the right path? Or is there something better, and I should stop wasting time on P::F?

(Pardon me for not searching here, but Super Search keeps coming back with "Query was not run; Server is too busy ("explain" took 11.70 seconds)".)

-QM
--
Quantum Mechanics: The dreams stuff is made of

Replies are listed 'Best First'.
Re: HTML to PDF?
by Khen1950fx (Canon) on May 05, 2007 at 05:54 UTC
    I think that you're getting parsing errors and no output because PDF::FromHTML doesn't "utilise CSS": it won't convert CSS and wasn't designed for it. Check the CAVEATS section, line 205 in FromHTML.pm, for more info. I ran html2pdf.pl on 10 files and got nothing back. All of my HTML files have CSS in them. I'd dump PDF::FromHTML and use HTML::HTMLDoc.

    update: HTML::HTMLDoc uses HTMLDOC. HTMLDOC 1.8.x doesn't support CSS; however, 1.9.x supports CSS1 and CSS2.
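
    For anyone who wants to try the HTML::HTMLDoc route, here is a minimal sketch based on the module's documented synopsis. It assumes the htmldoc binary is installed and on the PATH; the sub name html_file_to_pdf and the file names are made up for illustration.

```perl
use strict;
use warnings;

# Convert one HTML file to PDF through HTML::HTMLDoc.
# html_file_to_pdf is a hypothetical helper name, not part of the module.
sub html_file_to_pdf {
    my ($html_file, $pdf_file) = @_;

    # Loaded lazily so the script still compiles where the module is missing.
    require HTML::HTMLDoc;

    my $htmldoc = HTML::HTMLDoc->new();
    $htmldoc->set_input_file($html_file);

    # generate_pdf() hands back an HTML::HTMLDoc::PDF object.
    my $pdf = $htmldoc->generate_pdf()
        or die "htmldoc produced no output for $html_file\n";
    $pdf->to_file($pdf_file);
    return $pdf_file;
}

# Run only when called as: perl script.pl input.html output.pdf
html_file_to_pdf(@ARGV) if @ARGV == 2;
```

    Whether CSS survives the conversion depends on the HTMLDOC version underneath, as noted above.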

Re: HTML to PDF?
by clinton (Priest) on May 05, 2007 at 10:12 UTC
    I've just tried out HTML::HTMLDoc, and the current version doesn't support much in the way of CSS.

    The problem is, of course, that in order to format a web page correctly for conversion to PDF, you have to build what amounts to a web browser. So my thoughts were, well, why not use Firefox to do that?

    You'd think that would be easy: surely just a command-line option that uses the existing code to print to a PDF, rather than launching a browser window. Naturally, I'm not the first person to think of that. However, this feature request, which is not the first, was opened in 2001 and seems to be well and truly stalled.

    In that bug, they do mention gnome-web-photo, which I realise won't be of much use to you, as it is for Linux, but I tried it out, and it works rather well. It is available as a package on openSuSE at least, I presume on other Linux distributions as well.

    It produces PS files, which are easy to convert to PDF docs with ps2pdf.
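
    The ps2pdf step can be wrapped in a few lines of Perl. This is only a sketch: it assumes Ghostscript's ps2pdf is on the PATH, and the sub names ps2pdf_command and convert_ps_to_pdf are invented for this example.

```perl
use strict;
use warnings;

# Build the argument list for Ghostscript's ps2pdf wrapper.
# If no output name is given, swap a trailing .ps for .pdf,
# or append .pdf when the input has no .ps extension.
sub ps2pdf_command {
    my ($ps_file, $pdf_file) = @_;
    if (!defined $pdf_file) {
        $pdf_file = $ps_file;
        $pdf_file =~ s/\.ps$/.pdf/i or $pdf_file .= '.pdf';
    }
    return ('ps2pdf', $ps_file, $pdf_file);
}

# Run the conversion and return the name of the PDF that was written.
sub convert_ps_to_pdf {
    my @cmd = ps2pdf_command(@_);

    # List-form system() avoids passing the file names through a shell on Unix.
    system(@cmd) == 0
        or die "ps2pdf failed (exit status $?): @cmd\n";
    return $cmd[-1];
}

# Run only when file names are given: perl script.pl page.ps [page.pdf]
convert_ps_to_pdf(@ARGV) if @ARGV;
```

    Called with just page.ps, this would write page.pdf next to it.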

      Thanks for your comments.

      The primary application is on Windows, though Unix and Linux are available remotely when needed. I'm trying to help a colleague, so I'll pass these ideas along.

      The Firefox idea is a good one. Perhaps someone in some remote corner of the Net has done something like this, and I just haven't searched thoroughly enough.

      -QM
      --
      Quantum Mechanics: The dreams stuff is made of

Re: HTML to PDF?
by fmerges (Chaplain) on May 05, 2007 at 07:59 UTC

    Hi,

    There's also HTMLDOC, which is really powerful and easy to use.

    Regards,

    fmerges at irc.freenode.net