There are tools out there that can convert the Word format into something else, possibly something you can later convert into text/plain. antiword is one of them, and it does a pretty good job.
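Calling antiword from Perl can be sketched as below. This assumes antiword is installed and on your PATH, and that it writes the extracted text to standard output when given a file name. `doc_to_text` and its optional command argument are names made up for this illustration, and the list form of pipe open needs Perl 5.8 or later.

```perl
use strict;
use warnings;

# Hypothetical helper: run a converter on a Word file and return its
# standard output as one string. The converter defaults to antiword,
# which is assumed to be installed and on the PATH.
sub doc_to_text {
    my ($file, @cmd) = @_;
    @cmd = ('antiword') unless @cmd;

    # List-form pipe open: the file name never passes through a shell.
    open(my $fh, '-|', @cmd, $file)
        or die "Cannot run $cmd[0]: $!";
    local $/;                      # slurp mode: read everything at once
    my $text = <$fh>;
    close($fh)
        or die "$cmd[0] failed on $file (exit status $?)";
    return $text;
}
```

Something like `print doc_to_text('report.doc');` would then dump the text, where report.doc stands for whatever file you have.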

Unfortunately, I don't know of any free tool that can handle the bewildering complexity of a document laden with tables, pictures, OLE objects, bells, whistles, whatever. And what will become of your converter when they "upgrade" to MS Word 200x?

The safest approach would be to have access to an instance of Word itself and drive it from your Perl program with Win32::OLE or something like that. I've got no experience with that, though.
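For what it's worth, here is an untested sketch of what that might look like. It assumes a Windows machine with Word installed. `word_to_text` is a made-up name, and the value 2 is assumed to be Word's `wdFormatText` constant.

```perl
use strict;
use warnings;

# Hypothetical helper: have Word itself save a document as plain text.
# Both paths should be absolute, since Word resolves relative paths
# against its own working directory rather than the script's.
sub word_to_text {
    my ($in, $out) = @_;

    # Loaded at run time so the file still compiles where the module
    # is missing; Win32::OLE only exists on Windows.
    require Win32::OLE;

    # The second argument names the method called when the object is
    # destroyed, so Word quits even if we die part-way through.
    my $word = Win32::OLE->new('Word.Application', 'Quit')
        or die 'Cannot start Word: ' . Win32::OLE->LastError;

    my $doc = $word->Documents->Open($in)
        or die "Cannot open $in: " . Win32::OLE->LastError;
    $doc->SaveAs($out, 2);    # 2 is assumed to be wdFormatText
    $doc->Close(0);           # 0: close without saving changes
    return 1;
}
```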

--bwana147

Re: Re: Re: changing the file extension
by wine (Scribe) on Jun 13, 2001 at 22:03 UTC
    A good open source tool for converting even very complicated Word documents is wvWare.

    From http://www.wvware.com:

    wv is a library which allows access to Microsoft Word files. It can load and parse the word 2000, 97, 95 and 6 file formats.

    Provided with the wv distributions is an application called wvWare. wvWare is a "power-user" application with lots of command-line options, doo-dads, bells, and whistles. Less interesting, but more convenient are the helper scripts that utilize wvWare.

    I've used this program myself for quite a while; it's worth looking into.
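
    Calling one of the helper scripts from a Perl script could look something like this. A sketch only: it assumes the wvText helper is on your PATH and takes the Word file followed by the output file. `convert_doc` is a name made up here.

```perl
use strict;
use warnings;

# Hypothetical helper: run a two-argument converter (input file, then
# output file) and die unless it exits cleanly. It defaults to wvText,
# one of the helper scripts assumed to ship with wv.
sub convert_doc {
    my ($in, $out, @cmd) = @_;
    @cmd = ('wvText') unless @cmd;

    # List-form system: no shell, so odd file names are safe.
    system(@cmd, $in, $out) == 0
        or die "$cmd[0] failed on $in (status $?)";
    return $out;
}
```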