PerlMonks  

Create MS Word doc in Linux

by bassplayer (Monsignor)
on Mar 06, 2003 at 15:48 UTC ( [id://240912]=perlquestion )

bassplayer has asked for the wisdom of the Perl Monks concerning the following question:

Good Morning Monks,

I am in need of the ability to create a MS Word document from Linux. I have Super Searched and Googled but have not found what I needed. Some of the nodes I found were a bit old, so I thought I would ask if anymonk has found anything new towards this end.

I suppose I could create an RTF instead, but my preference is a true Word doc. For that I will probably have to use Win32::OLE on a Windows box, but I would really rather avoid it for many reasons. I suppose what I am looking for is the Word version of Spreadsheet::WriteExcel. :)

Does anyone have any suggestions for either the Word solution or the RTF solution?

bassplayer

Replies are listed 'Best First'.
Re: Create MS Word doc in Linux
by Mr. Muskrat (Canon) on Mar 06, 2003 at 16:11 UTC
    I think that you should read Why not Word! before continuing your search.
Re: Create MS Word doc in Linux
by rozallin (Curate) on Mar 06, 2003 at 16:18 UTC
    I don't think there is any alternative to using RTF, PDF or HTML instead, as I've never found an app for Linux that will create an MS Word document. Renaming a file to foo.doc is a bit hit and miss, as different versions of MS Word tend to react differently to it, in my experience.

    There is a working group which has been founded to develop an open standard which can interact with OpenOffice software and other applications using XML, but as far as I know they haven't released a working solution.

    --
    Rozallin J. Thompson
    The Webmistress who doesn't hesitate to use strict;

Re: Create MS Word doc in Linux
by poj (Abbot) on Mar 06, 2003 at 17:04 UTC
    For an RTF solution, take a look at RTF::Writer
    I needed to generate a simple document (no tables) but with various font styles. I learnt how to do it by reading the RTF::Cookbook.pod.
    poj
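    For a sense of what the RTF route produces, here is a minimal sketch that writes RTF by hand using only core Perl. It is not RTF::Writer's API (that module does the escaping and bookkeeping for you); the function names rtf_escape and rtf_document are made up for illustration, and the sketch assumes plain ASCII text.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Escape the three characters that are special in RTF: \ { }
# Assumption: input is ASCII; other characters would need \'hh or \uN escapes.
sub rtf_escape {
    my ($text) = @_;
    $text =~ s/([\\{}])/\\$1/g;
    return $text;
}

# Build a complete RTF document: a bold 18pt title followed by 12pt paragraphs.
# (\fs takes half-points, so \fs36 is 18pt and \fs24 is 12pt.)
sub rtf_document {
    my ($title, @paragraphs) = @_;
    my $rtf = '{\rtf1\ansi\deff0{\fonttbl{\f0 Times New Roman;}}' . "\n";
    $rtf .= '{\pard\f0\fs36\b ' . rtf_escape($title) . '\par}' . "\n";
    for my $para (@paragraphs) {
        $rtf .= '{\pard\f0\fs24 ' . rtf_escape($para) . '\par}' . "\n";
    }
    return $rtf . '}';
}

# Only write a file when an output name is given on the command line.
if (@ARGV) {
    open my $fh, '>', $ARGV[0] or die "Cannot write $ARGV[0]: $!\n";
    print {$fh} rtf_document('Report', 'First paragraph.', 'Braces {like these} are escaped.');
    close $fh or die "Cannot close $ARGV[0]: $!\n";
}
```

    Run it as `perl make_rtf.pl report.rtf` (the script name is arbitrary) and open the result in a word processor.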
Re: Create MS Word doc in Linux
by swiftone (Curate) on Mar 06, 2003 at 16:00 UTC
    I've never heard of a way to do this outside of Win32, even though I can't see why it would be impossible. I have heard of various ways of "cheating", usually involving creating a .txt or .rtf file, and naming it .doc. Word will (reportedly) convert silently on the fly when opening.

    If no one here answers your question, and you discover an answer, please post it. I'm sure you aren't the first to face the dilemma.

      I am using MsOffice::Word::HTML::Writer and it works fine, but when I try to install it on another server I get this error message: Failed during this command: DAMI/MsOffice-Word-HTML-Writer-1.03.tar.gz : make_test NO

Re: Create MS Word doc in Linux
by John M. Dlugosz (Monsignor) on Mar 06, 2003 at 16:03 UTC
    I recall a portable development effort for that, but cannot remember the name!

    I know that Open Office (openoffice.org?) will handle MS Word files, and it comes with source code. So check there and see what they're using.

    —John

Re: Create MS Word doc in Linux
by hardburn (Abbot) on Mar 06, 2003 at 16:06 UTC

    Reading Word docs can work in non-MS software much of the time. Writing them reliably is a much harder trick. Official MS documentation for the Word format is incomplete and often wrong, not to mention the changes that happen in the format for every new version.

    I'd go for RTF or maybe an XML-based solution.

    ----
    Reinvent a rounder wheel.

    Note: All code is untested, unless otherwise stated

Re: Create MS Word doc in Linux
by phydeauxarff (Priest) on Mar 06, 2003 at 18:24 UTC
    Check out the RTF Cookbook at CPAN
    It will get you down the path with document and character formatting.
Re: Create MS Word doc in Linux
by traveler (Parson) on Mar 06, 2003 at 21:19 UTC
    One option might be to save the file as RTF, as has been suggested, then write an OpenOffice macro to open the file and save it in MS Word format. If you can get the macro to start without intervention, you would have a solution.

    HTH, --traveler
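    A hedged sketch of the unattended conversion step, driven from Perl. It does not use a macro; it assumes a LibreOffice install whose `soffice` binary is on the PATH and supports `--headless --convert-to`, which is newer than the OpenOffice discussed in this thread. The helper names convert_command and convert_to_word are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the argument list for a headless conversion.
# Assumption: a LibreOffice-style soffice that understands --convert-to.
sub convert_command {
    my ($input, $outdir, $format) = @_;
    $format ||= 'doc';
    return ('soffice', '--headless', '--convert-to', $format,
            '--outdir', $outdir, $input);
}

# Run the conversion; the list form of system() avoids shell quoting problems.
sub convert_to_word {
    my ($input, $outdir) = @_;
    my @cmd = convert_command($input, $outdir, 'doc');
    system(@cmd) == 0
        or die "Conversion failed: @cmd (exit status $?)\n";
    return;
}

# Usage: perl convert.pl report.rtf /path/to/outdir
convert_to_word(@ARGV) if @ARGV == 2;
```

    Starting the office suite for each document is slow, so this suits batch jobs better than per-request conversion.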

Re: Create MS Word doc in Linux
by jonadab (Parson) on Mar 09, 2003 at 01:47 UTC

    You say "in Linux", but I assume you mean that you want to create Word documents using Perl, which you happen to be running on Linux.

    Word document format is, errr, complex. Others have said, "make an RTF and convert", but RTF of course lacks most of the features you probably want. However, OpenOffice format (sxw) is very full-featured. Better, it's very straightforward and not hard to generate using Perl. Word can't read them, but OO can do the conversion. This saves you from having to deal directly with Word format as such. (You get to deal with XML, which is MUCH easier.)

    In general, here's the process I use for automatically generating documents from Perl:

    1. Use OpenOffice to create a basic document that contains all the elements you're going to want, but with only token sample information. Get all the formatting just the way you want it: margin settings, fonts, how much space above each type of paragraph, whether to keep it with the next, all that stuff. Save.
    2. Unzip it into a working directory that your Perl script will have access to.
    3. Copy content.xml and paste it into a string (HEREDOC, possibly) in your Perl script. Break the string into three parts: the parts up to and including the body tags, the actual body, and the closing tags at the end.
    4. Replace the actual body of the document with Perl code that generates the body dynamically. Each type of paragraph/table/whatever will have a style associated with it, which refers to the style information in the other files, but all you're changing is the content, presumably. (This is why you only have to change content.xml. If you wanted to dynamically select font sizes and stuff (rather than using the same ones each time you generate a given type of document) then you would have to rewrite one or more of the other files too.)
    5. After the Perl script rewrites content.xml, all it has to do is zip up the working directory to create an .sxw file. I've been using backquotes to call info-zip, but only because I haven't bothered yet to find the zip module that I'm sure exists on CPAN.
    6. If you want Word format, then open the document in OO and Save As. It is probably possible to script this part too, but I haven't done so.
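    Steps 3 and 4 can be sketched roughly as follows. This is an illustration under assumptions, not the exact script described above: it splits the template at the `<office:body>` tags with a regex instead of pasting three strings into the script, and the paragraph style name passed in (for example "P1") is hypothetical and must match a style defined in your own sample document.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Escape text so it is safe inside XML element content and attributes.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Split content.xml into three parts: everything up to and including the
# opening body tag, the sample body, and the closing tags at the end.
sub split_template {
    my ($xml) = @_;
    my ($head, $body, $tail) =
        $xml =~ m{\A(.*?<office:body[^>]*>)(.*)(</office:body>.*)\z}s
        or die "No <office:body> element found in template\n";
    return ($head, $body, $tail);
}

# Generate one <text:p> element per paragraph, all using the same style.
sub build_body {
    my ($style, @paragraphs) = @_;
    my $body = '';
    for my $para (@paragraphs) {
        $body .= '<text:p text:style-name="' . xml_escape($style) . '">'
               . xml_escape($para) . '</text:p>';
    }
    return $body;
}

# Replace the sample body of the template with freshly generated paragraphs.
sub render_content {
    my ($template, $style, @paragraphs) = @_;
    my ($head, undef, $tail) = split_template($template);
    return $head . build_body($style, @paragraphs) . $tail;
}
```

    Read the unzipped content.xml into a string, pass it to render_content along with your style name and paragraphs, and write the result back over content.xml before zipping.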

    This could be extended, of course, to automatically generate more than just the text content: it would be trivial to insert images (just copy the image file into your working directory, refer to it in the XML by filename, and zip it right in), and with a little bit of experimentation I'm sure it would not be hard to embed spreadsheets and all sorts of fun stuff. With OO, it's all XML, so automatically generating it from Perl is a breeze. It's not really any harder than writing a CGI script to generate (valid) HTML.
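    The zipping step can also be done without shelling out to info-zip. The sketch below uses IO::Compress::Zip, which is bundled with recent Perls (Archive::Zip on CPAN is another option). The function name zip_dir is made up, and writing a `mimetype` member first and uncompressed is an assumption about packaging convention that you should check against a file OpenOffice itself saved.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find ();
use File::Spec;
use IO::Compress::Zip qw($ZipError :zip_method);

# Zip every file under $dir into $out, using paths relative to $dir as
# member names. Returns the member names in the order they were written.
sub zip_dir {
    my ($dir, $out) = @_;

    # Collect relative paths of all plain files in the working directory.
    my @files;
    File::Find::find({
        no_chdir => 1,
        wanted   => sub { push @files, File::Spec->abs2rel($_, $dir) if -f $_ },
    }, $dir);
    tr{\\}{/} for @files;    # zip member names use forward slashes

    # Assumption: 'mimetype', if present, goes first; the rest alphabetically.
    @files = sort {
        ($a eq 'mimetype' ? 0 : 1) <=> ($b eq 'mimetype' ? 0 : 1) || $a cmp $b
    } @files;
    die "No files found under $dir\n" unless @files;

    my $zip;
    for my $name (@files) {
        my %opt = (
            Name   => $name,
            Method => $name eq 'mimetype' ? ZIP_CM_STORE : ZIP_CM_DEFLATE,
        );
        if ($zip) {
            $zip->newStream(%opt);    # start the next member
        }
        else {
            $zip = IO::Compress::Zip->new($out, %opt)
                or die "Cannot create $out: $ZipError\n";
        }
        my $path = File::Spec->catfile($dir, $name);
        open my $in, '<:raw', $path or die "Cannot read $path: $!\n";
        local $/;                     # slurp the whole file
        $zip->print(scalar <$in>);
        close $in;
    }
    $zip->close or die "Cannot finish $out: $ZipError\n";
    return @files;
}
```

    Keep the output file outside the working directory, for example `zip_dir('workdir', 'report.sxw')`, so the archive does not try to include itself.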

    The only bummer with this approach is that using OO to do the conversion to Word format is a fairly heavyweight thing in terms of system resources. OO has a substantial memory footprint. You wouldn't want to do this on an old Pentium/90 that you've installed Linux on to use as a web server, for example. (You could generate the OO document on there, but you wouldn't want to run OO on there to do the conversion.) 128MB of RAM is recommended, IIRC, for running OO. Also, you don't mention the frequency or speed with which you need to spit out documents. If this is the kind of thing where you're handling web requests and returning a doc to a remote client, then the overhead of OO's load time will be too great. OTOH if you're generating a report that you want to print to give to your boss, you're going to have to load the document in a word processor anyway (to print it), so nothing lost.


    for(unpack("C*",'GGGG?GGGG?O__\?WccW?{GCw?Wcc{?Wcc~?Wcc{?~cc' .'W?')){$j=$_-63;++$a;for$p(0..7){$h[$p][$a]=$j%2;$j/=2}}for$ p(0..7){for$a(1..45){$_=($h[$p-1][$a])?'#':' ';print}print$/}
