Re: Convert .doc to .pdf
by Corion (Patriarch) on Jan 31, 2007 at 17:58 UTC
|
The easiest way is to buy MS Office and Windows, install both in a virtual machine and use that. A comparable way is likely to buy Adobe Acrobat or Adobe Distiller to convert your Word documents to PDF. Maybe Adobe doesn't insist on Windows.
A different way might be to try to automate OpenOffice, as it has import filters for Word and export filters for PDF. Unfortunately, OpenOffice is bad to automate unless you like Java and the object model that Java tends to impose.
The ugliest, but in the long term most beneficial, approach would be to extract the import and export filters from OpenOffice and turn them into Perl extensions, or at least command-line programs, to encode or decode as you want.
| [reply] |
|
|
Unfortunately, OpenOffice is bad to automate unless you like Java and the object model that Java tends to impose.
You can also use Python to drive OpenOffice (not that that's much better than Java, mind you . . . :). I don't recall where I found the sample code on which I based what I wrote (a converter that munged SXC XML files, which had been run through Template Toolkit, into Excel XLS files), but the Python page in the OO wiki may get you started.
Update: Aaah, found the links to more examples: http://udk.openoffice.org/python/python-bridge.html
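For the .doc to .pdf case, driving OpenOffice from Python might look roughly like the sketch below. It is not the converter mentioned above. It assumes an office instance has already been started listening on a socket (localhost, port 2002 here, both arbitrary choices), that the PyUNO bridge is importable from the Python you run, and that the Writer PDF export filter is registered under the name "writer_pdf_Export". The function names are made up for illustration.

```python
import os


def pdf_path_for(doc_path):
    """Return doc_path with its extension replaced by .pdf."""
    root, _ext = os.path.splitext(doc_path)
    return root + ".pdf"


def convert_with_pyuno(doc_path, host="localhost", port=2002):
    """Load doc_path in a running office instance and export it as PDF."""
    # Imported here so pdf_path_for() stays usable without OpenOffice installed.
    import uno
    from com.sun.star.beans import PropertyValue

    # Connect to the already running office (assumed host/port, see above).
    local_ctx = uno.getComponentContext()
    resolver = local_ctx.ServiceManager.createInstanceWithContext(
        "com.sun.star.bridge.UnoUrlResolver", local_ctx)
    ctx = resolver.resolve(
        "uno:socket,host=%s,port=%d;urp;StarOffice.ComponentContext"
        % (host, port))
    desktop = ctx.ServiceManager.createInstanceWithContext(
        "com.sun.star.frame.Desktop", ctx)

    # Open the document without showing a window.
    hidden = PropertyValue()
    hidden.Name = "Hidden"
    hidden.Value = True

    # Select the PDF export filter (assumed filter name).
    pdf_filter = PropertyValue()
    pdf_filter.Name = "FilterName"
    pdf_filter.Value = "writer_pdf_Export"

    out_path = pdf_path_for(doc_path)
    in_url = uno.systemPathToFileUrl(os.path.abspath(doc_path))
    out_url = uno.systemPathToFileUrl(os.path.abspath(out_path))

    document = desktop.loadComponentFromURL(in_url, "_blank", 0, (hidden,))
    try:
        document.storeToURL(out_url, (pdf_filter,))
    finally:
        document.close(True)
    return out_path
```

Calling convert_with_pyuno("report.doc") should then leave report.pdf next to the original, provided the office process is reachable.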
| [reply] |
Re: Convert .doc to .pdf
by BrowserUk (Patriarch) on Jan 31, 2007 at 18:24 UTC
|
| [reply] |
Re: Convert .doc to .pdf
by klekker (Pilgrim) on Jan 31, 2007 at 19:35 UTC
|
| [reply] |
Re: Convert .doc to .pdf
by ww (Archbishop) on Jan 31, 2007 at 18:35 UTC
|
Or, perhaps, you might elaborate step 4):
a) Open Word doc with OpenOffice (does the job very nicely)
b) Tell OO to save as .pdf (in some appropriate place)
continue with step 5)
UPDATE: Missed Corion's and Fletch's mention/deprecation of this idea, but OO2.x both reads Word .docs reliably and has an option for .pdf output... which might even be worth the pain of using Java to achieve the minimal automation required in 4b and 5 ... if, in fact, there's no public API to interface with Perl.
...and believe me, it pains me to say that.
<;-) | [reply] |
|
|
I have done this at $work; unfortunately, I am unable to release it outside of my cubicle walls. But I will say that it is based heavily on the code sample found at http://www.codeproject.com/office/PortableOpenOffice.asp, and hooked into an Apache server via a CGI call. Oh, just so it applies to PerlMonks, the CGI wrapper is a Perl script that does some pre- and post-processing on the file: validation testing, meta-data fill-in, etc.
| [reply] |
|
|
Very nice.
I just tried this with OO 2.0.2
Built the macro, as described in the article you linked to.
Since I was not interested in a CGI wrapper, the only interesting part of the ASP/C# code in the download accompanying the article is how to call it (Win here):
path_to_OO_executables\swriter.exe macro:///ConversionLibrary.PDFConversion.ConvertWordToPDF(Word.doc,Output.pdf)
(I used the same names as in the article)
Assemble that command line dynamically with the file names needed, run it as a background process, and you are done.
Worked fine, except for an "ErrorCodeIOException" occurring at the export call...
Hm, took me a few minutes to realise that it was not my fault, but a known bug in V2.0.2. ;-/
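Assembling that command line dynamically and running it in the background could be sketched like this. The macro name is the one from the article; the function names and the example swriter.exe path are placeholders, and the sketch assumes the file names contain no commas or parentheses, since those would confuse the macro argument list.

```python
import subprocess


def build_macro_command(swriter_path, doc_path, pdf_path):
    """Return the argument list for the ConvertWordToPDF macro call."""
    macro_url = (
        "macro:///ConversionLibrary.PDFConversion.ConvertWordToPDF"
        "(%s,%s)" % (doc_path, pdf_path)
    )
    return [swriter_path, macro_url]


def convert_in_background(swriter_path, doc_path, pdf_path):
    """Start the conversion and return at once with the process handle."""
    # Popen does not wait for the process, so the conversion runs in the
    # background; call .wait() on the result if you need to block.
    return subprocess.Popen(
        build_macro_command(swriter_path, doc_path, pdf_path))
```

For example, build_macro_command(r"C:\OO\swriter.exe", "Word.doc", "Output.pdf") produces the same two arguments as the command line shown above.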
| [reply] |
|
|
Re: Convert .doc to .pdf
by dragonchild (Archbishop) on Jan 31, 2007 at 19:08 UTC
|
The biggest issue in all of this is parsing the .doc - if you can do that, then you can create a PS file and use ps2pdf to finish it off. So, you really want to be looking for something that parses doc -> ps. And, given that MS has been very very tight with the .doc format, it may not be doable.
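The easy half of that pipeline, handing an existing PostScript file to ps2pdf, might be sketched as below. It assumes Ghostscript's ps2pdf script is on the PATH and says nothing about the hard part (getting from .doc to PostScript); the function names are illustrative only.

```python
import subprocess


def build_ps2pdf_command(ps_path, pdf_path):
    """Return the argument list for converting ps_path to pdf_path."""
    return ["ps2pdf", ps_path, pdf_path]


def finish_with_ps2pdf(ps_path, pdf_path):
    """Run ps2pdf and raise CalledProcessError if it fails."""
    subprocess.run(build_ps2pdf_command(ps_path, pdf_path), check=True)
    return pdf_path
```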
My criteria for good software:
- Does it work?
- Can someone else come in, make a change, and be reasonably certain no bugs were introduced?
| [reply] |
Re: Convert .doc to .pdf
by glasswalk3r (Friar) on Feb 01, 2007 at 12:42 UTC
|
Maybe you could use OpenOffice and its internal macro language to do the conversion job.
Alceu Rodrigues de Freitas Junior
---------------------------------
"You have enemies? Good. That means you've stood up for something, sometime in your life." - Sir Winston Churchill
| [reply] |
|
|
Uh, rather than write code, how about you tell openoffice to write the document to the PDF printer, like:
/usr/local/OOffice1.1.5/soffice -pt "PDF" somefile.doc
The "-pt" says to print this document to the specified printer.
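If you want to run that over a batch of files from a script, a minimal sketch could look like this. It assumes a printer really is configured under the name "PDF" and that soffice lives at the path shown above; the function names are made up, and each file gets its own soffice call to stay close to the command as written.

```python
import subprocess


def build_print_command(soffice_path, printer_name, doc_path):
    """Return the argument list for printing doc_path to printer_name."""
    return [soffice_path, "-pt", printer_name, doc_path]


def print_all(soffice_path, printer_name, doc_paths):
    """Print each document in turn, stopping at the first failure."""
    for doc_path in doc_paths:
        cmd = build_print_command(soffice_path, printer_name, doc_path)
        subprocess.run(cmd, check=True)
```

So print_all("/usr/local/OOffice1.1.5/soffice", "PDF", ["a.doc", "b.doc"]) would issue the command above once per file.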
| [reply] |
Re: Convert .doc to .pdf
by sgt (Deacon) on Jan 31, 2007 at 20:29 UTC
|
| [reply] |