in reply to Open a Microsoft Word Doc and Save as plain text file

I'm not aware of a pure-Perl solution for parsing MS Word documents. A search on http://freshmeat.net turns up a few programs that can do the conversion, or you can use blue_cowdawg's suggestion of Win32::OLE if you're on a Windows system with Word (and you trust the documents not to do anything nasty, like take over the computer where your script is running).

File::Type can help you identify the type of the file.
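As a rough sketch of what that could look like: the snippet below wraps File::Type's `checktype_filename` in a small helper. The helper name `guess_mime_type` is made up for illustration, and the exact string reported for a Word document depends on the module's signature table, so treat the result as a hint rather than proof.

```perl
use strict;
use warnings;
use File::Type;    # CPAN module, not part of core Perl

# Hypothetical helper (not from the original post): return File::Type's
# guess at the MIME type, based on the file's content.
sub guess_mime_type {
    my ($path) = @_;
    my $ft = File::Type->new;
    return $ft->checktype_filename($path);
}

# Report a guess for every file named on the command line.
print "$_: ", guess_mime_type($_), "\n" for @ARGV;
```

Because the guess comes from the bytes in the file rather than from its name, renaming a file does not change the answer.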

Re^2: Open a Microsoft Word Doc and Save as plain text file
by soon_j (Scribe) on Jun 08, 2006 at 15:45 UTC
    Thank you, monks, for your input.

    I found this code, using CGI, to be very good at determining the type of an uploaded file:

    my $cgi  = CGI->new;    # same as "new CGI", without indirect-object syntax
    my $file = $cgi->param('file');
    my $type = $cgi->uploadInfo($file)->{'Content-Type'};    # as sent by the browser
      Ah, yes, good idea. It does rely on the client to identify the document, though; if the user's Web browser doesn't know what the file is, it will probably be reported to you as application/octet-stream (or as text/plain, if it smells like text).
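One way to avoid depending on the browser's report is to look at the first few bytes of the upload on the server. The sketch below is not from the thread: it assumes the files of interest are legacy .doc files, which are stored in the OLE2 compound file container, and the helper name `looks_like_ole2` is invented for illustration. It uses only core Perl.

```perl
use strict;
use warnings;

# First eight bytes of an OLE2 compound file. Legacy .doc files use this
# container, but so do .xls, .ppt, and .msi, so a match means "possibly
# Word", not "definitely Word".
my $OLE2_MAGIC = "\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1";

# Hypothetical helper: true if the (seekable) filehandle starts with the
# OLE2 signature. The handle is rewound afterwards so the caller can
# still read the whole file.
sub looks_like_ole2 {
    my ($fh) = @_;
    binmode $fh;
    seek($fh, 0, 0) or return 0;
    my $head = '';
    my $got  = read($fh, $head, length $OLE2_MAGIC);
    seek($fh, 0, 0);
    return 0 unless defined $got && $got == length $OLE2_MAGIC;
    return $head eq $OLE2_MAGIC ? 1 : 0;
}
```

In a CGI script, the handle could come from `$cgi->upload('file')`. Note that .docx files are ZIP archives and start with `PK`, so this check will not match them.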