in reply to changing the file extension

Actually, I meant to say that I am trying to change a .doc to a .txt. Either way, thanks. I am looking into the RTF modules on CPAN now. Blacksmith.

Re: Re: changing the file extension
by bwana147 (Pilgrim) on Jun 13, 2001 at 20:43 UTC

    There are tools out there that allow you to convert the Word format into something else, and possibly something you can later convert into text/plain. antiword is one of them and does a pretty good job.
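
    As a rough sketch of the antiword route: run with no options, antiword writes the document's text to standard output, so a small wrapper can capture that and save it as a .txt file. The sketch is in Python, and the helper names build_antiword_command and doc_to_txt are made up for illustration. It assumes antiword is installed and on the PATH.

```python
import shutil
import subprocess
from pathlib import Path


def build_antiword_command(doc_path):
    """Return the argument list for a plain-text antiword run (hypothetical helper)."""
    # With no options, antiword prints the document as text on stdout.
    return ["antiword", str(doc_path)]


def doc_to_txt(doc_path, txt_path=None):
    """Convert a Word file to text and return the path of the .txt file written."""
    doc_path = Path(doc_path)
    # Default output name: same file name with the extension swapped to .txt.
    txt_path = Path(txt_path) if txt_path else doc_path.with_suffix(".txt")
    if shutil.which("antiword") is None:
        raise RuntimeError("antiword was not found on PATH")
    # check=True raises CalledProcessError if antiword exits with an error.
    result = subprocess.run(
        build_antiword_command(doc_path), capture_output=True, check=True
    )
    txt_path.write_bytes(result.stdout)
    return txt_path
```

    Calling doc_to_txt("report.doc") should leave a report.txt next to the original. How well tables and embedded objects survive depends on antiword itself, not on the wrapper.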

    Unfortunately, I don't know of any free tool that can handle the bewildering complexity of a document laden with tables, pictures, OLE objects, bells, whistles, whatever. And what will become of your converter when they "upgrade" to MS Word 200x?

    The safest approach would be to have access to an instance of Word itself and drive it from your Perl program with Win32::OLE or something like that. I've got no experience with that, though.

    --bwana147

      A good open source tool for converting even very complicated Word documents is wvWare.

      From http://www.wvware.com:

      wv is a library which allows access to Microsoft Word files. It can load and parse the Word 2000, 97, 95 and 6 file formats.

      Provided with the wv distributions is an application called wvWare. wvWare is a "power-user" application with lots of command-line options, doo-dads, bells, and whistles. Less interesting, but more convenient, are the helper scripts that utilize wvWare.

      I've used this program myself for quite a while; it's worth looking into.
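
      One of those helper scripts is wvText, which to my knowledge takes the Word file and the output text file as its two arguments. A minimal sketch of calling it from a script follows, in Python. The function names build_wvtext_command and convert_with_wvtext are hypothetical, and it assumes the wv distribution is installed with wvText on the PATH.

```python
import subprocess
from pathlib import Path


def build_wvtext_command(doc_path, txt_path=None):
    """Return the argument list for wvText (hypothetical helper)."""
    doc_path = Path(doc_path)
    # Default output name: same file name with the extension swapped to .txt.
    txt_path = Path(txt_path) if txt_path else doc_path.with_suffix(".txt")
    # Assumed invocation: wvText <word document> <text output file>
    return ["wvText", str(doc_path), str(txt_path)]


def convert_with_wvtext(doc_path, txt_path=None):
    """Run wvText and return the path of the text file it was asked to write."""
    command = build_wvtext_command(doc_path, txt_path)
    # check=True raises CalledProcessError if wvText exits with an error.
    subprocess.run(command, check=True)
    return Path(command[-1])
```

      For example, convert_with_wvtext("report.doc") would ask wvText to write report.txt alongside the original. Check the wv documentation for your version before relying on the exact arguments.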