blacksmith has asked for the wisdom of the Perl Monks concerning the following question:

Is there any way to change not only the file extension of a file but also its format? In other words, I am trying to change a .txt file into a .doc file. I am able to change the extension, but the content of the file then turns into garbage. Blacksmith.

Re: changing the file extension
by petdance (Parson) on Jun 13, 2001 at 19:15 UTC
    Short answer: No.

    Long answer: Renaming the file itself is trivial. Check out the rename() function. However, converting from text to Microsoft Word format is something entirely different. You'll want to look at the RTF modules on CPAN.
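
    For the renaming part, here is a minimal sketch. The helper name change_extension and the file names are made up for illustration. It only changes the name; the bytes inside are untouched, so it does not convert anything.

    ```perl
    use strict;
    use warnings;
    use File::Basename qw(fileparse);
    use File::Temp qw(tempdir);

    # Swap a file's extension without touching its contents.
    # Returns the new path, or dies with the OS error if rename() fails.
    sub change_extension {
        my ($path, $new_ext) = @_;
        # Split into base name, directory (with trailing separator) and old extension.
        my ($name, $dir, $old_ext) = fileparse($path, qr/\.[^.]*/);
        my $new_path = $dir . $name . $new_ext;
        rename $path, $new_path or die "Cannot rename $path to $new_path: $!";
        return $new_path;
    }

    # Demo in a scratch directory that is removed when the script exits.
    my $dir = tempdir(CLEANUP => 1);
    my $txt = "$dir/notes.txt";
    open my $out, '>', $txt or die "Cannot write $txt: $!";
    print {$out} "plain text\n";
    close $out or die "Cannot close $txt: $!";

    my $doc = change_extension($txt, '.doc');
    print "renamed to $doc\n";
    ```

    The result is still a plain text file that merely has a .doc name.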

    xoxo,
    Andy

    %_=split/;/,".;;n;u;e;ot;t;her;c; ".   #   Andy Lester
    'Perl ;@; a;a;j;m;er;y;t;p;n;d;s;o;'.  #   http://petdance.com
    "hack";print map delete$_{$_},split//,q<   andy@petdance.com   >
    
Re: changing the file extension
by mikeB (Friar) on Jun 13, 2001 at 23:40 UTC
    On a Windows machine, you could use Word directly to open a file in one format and save it in another. See the Win32::OLE module.
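
    A rough sketch of that idea, going from .doc to plain text. It assumes Windows with Word installed, and it has not been tried against any particular Word version. The sub name word_doc_to_text is made up; the numeric constants are the Word values for wdFormatText and wdDoNotSaveChanges.

    ```perl
    use strict;
    use warnings;
    use File::Spec;

    # Open $doc in Word and save it as plain text at $txt.
    # Win32::OLE is loaded lazily so the script still compiles on other systems.
    sub word_doc_to_text {
        my ($doc, $txt) = @_;
        require Win32::OLE;

        # Word resolves relative paths against its own working directory,
        # so hand it absolute paths.
        ($doc, $txt) = map { File::Spec->rel2abs($_) } $doc, $txt;

        # 'Quit' is the method Win32::OLE calls on Word when $word is destroyed.
        my $word = Win32::OLE->new('Word.Application', 'Quit')
            or die 'Cannot start Word: ' . Win32::OLE->LastError();
        my $file = $word->Documents->Open($doc)
            or die "Cannot open $doc: " . Win32::OLE->LastError();

        $file->SaveAs($txt, 2);   # 2 is wdFormatText (plain text)
        $file->Close(0);          # 0 is wdDoNotSaveChanges
        return $txt;
    }

    # Usage: perl word2txt.pl report.doc report.txt
    word_doc_to_text(@ARGV[0, 1]) if @ARGV == 2;
    ```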
Re: changing the file extension
by blacksmith (Hermit) on Jun 13, 2001 at 19:33 UTC
    Actually, I meant to say that I am trying to change a .doc to a .txt. Either way, thanks; I am looking into the RTF modules on CPAN now. Blacksmith.

      There are tools out there that allow you to convert the Word format into something else, and possibly something you can later convert into text/plain. antiword is one of them and does a pretty good job.
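
      A minimal sketch of calling antiword from Perl, assuming it is installed and on your PATH. The sub name doc_to_text is made up, and the optional command argument is only there so another converter can be swapped in. antiword prints the extracted text on standard output, which is captured here and written to a file.

      ```perl
      use strict;
      use warnings;

      # Run a converter on $doc and write whatever it prints to $txt.
      # @cmd defaults to antiword; the input path is appended as the last argument.
      sub doc_to_text {
          my ($doc, $txt, @cmd) = @_;
          @cmd = ('antiword') unless @cmd;

          # List-form pipe open: no shell is involved, so odd file names are safe.
          open my $in, '-|', @cmd, $doc or die "Cannot run $cmd[0]: $!";
          open my $out, '>', $txt or die "Cannot write $txt: $!";
          print {$out} $_ while <$in>;
          close $out or die "Cannot close $txt: $!";
          # close on a pipe fails if the command exited with a non-zero status.
          close $in or die "$cmd[0] failed on $doc (exit status $?)";
          return $txt;
      }

      # Usage: perl doc2txt.pl report.doc report.txt
      doc_to_text(@ARGV[0, 1]) if @ARGV == 2;
      ```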

      Unfortunately, I don't know of any free tool that can handle the bewildering complexity of a document laden with tables, pictures, OLE objects, bells, whistles, whatever. And what will become of your converter when they "upgrade" to MS Word 200x?

      The safest approach would be to have access to an instance of Word itself and drive it from your Perl program with Win32::OLE or something like that. I've got no experience with that, though.

      --bwana147

        A good open source tool for converting even very complicated Word documents is wvWare.

        From http://www.wvware.com:

        wv is a library which allows access to Microsoft Word files. It can load and parse the Word 2000, 97, 95 and 6 file formats.

        Provided with the wv distributions is an application called wvWare. wvWare is a "power-user" application with lots of command-line options, doo-dads, bells, and whistles. Less interesting, but more convenient, are the helper scripts that utilize wvWare.

        I've used this program myself for quite a while; it's worth looking into.