in reply to Win32::OLE for MS-Word

Hello,
thanks a lot for your reply.
I just tried out your first section of code, i.e. use Word
to save a .doc file as a .txt file.
Here's the code:
sub save_doc_as_text {
    my ( $infile, $outfile ) = @_;
    print "\n$infile";
    print "\n$outfile";
    require Win32::OLE;
    my $word = Win32::OLE->new( 'Word.Application', sub { $_[0]->Quit; } );
    error( "Can't create new instance or Word Reason:$Win32::OLE::LastError" ) unless $word;
    $word->{visible} = 0;
    my $doc = $word->{Documents}->Open($infile);
    error( "Can't open $infile, Reason:$Win32::OLE::LastError" ) unless $doc;
    # wdFormatDocument wdFormatText wdFormatHTML
    $doc->SaveAs( { FileName => $outfile, FileFormat => $wdFormatText } );
    $doc->Close;
    undef $doc;
    undef $word;
}
$inf  = "ole-word-demo-3.doc";
$outf = "ole-word-demo-3.txt";
save_doc_as_text( $inf, $outf );
The file ole-word-demo-3.doc is located in the folder
in which I've stored the above Perl script and executed
it. However, no ole-word-demo-3.txt file gets created in that
folder at all. What could be the problem?
Also, I'm not sure what parameters we need to pass to the
second subroutine (find and replace).
Actually, I need to save a .doc file as a text file, then
tokenize the text file and process the words in it in some
way, then convert the .txt file to .doc again.
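For the tokenizing step in the middle of that plan, something along these lines might be a starting point. It is only a minimal sketch: the helper names tokenize and word_counts are made up for illustration, and it assumes a "word" is simply a run of ASCII letters, digits, or apostrophes, which may be too crude for real documents.

```perl
use strict;
use warnings;

# Hypothetical helper: split a string into lowercase word tokens.
# Assumption: a word is a run of ASCII letters, digits, or apostrophes.
sub tokenize {
    my ($text) = @_;
    return map { lc $_ } ( $text =~ /([A-Za-z0-9']+)/g );
}

# Hypothetical helper: count how often each token occurs.
sub word_counts {
    my ($text) = @_;
    my %count;
    $count{$_}++ for tokenize($text);
    return \%count;
}
```

The text file that Word writes could be read into a string with the usual open/read idiom and then passed to tokenize.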
Also (please treat this as an ignorant soul asking a silly
question), is there any remote possibility that a .doc
file can be made to open in a Tk text widget?
Thanx and regards

Re: Re: Win32::OLE for MS-Word
by WhiteBird (Hermit) on Jun 12, 2003 at 16:08 UTC
    The file ole-word-demo-3.doc is located in the folder in which I've stored the above Perl script and executed it. However, no ole-word-demo-3.txt file gets created in that folder at all. What could be the problem?

    When I test this out, I get an error message indicating that the ole-word-demo-3.doc file cannot be found. If I hard code the location of the document as:

    $inf="C:\\some_dir\\ole-word-demo-3.doc";
    $outf="C:\\some_dir\\ole-word-demo-3.txt";
    then the program works to open the specified .doc file and creates the text file. As for the rest of your task, I hope some other Monk will have some insight.
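    The likely reason is that Word runs as a separate process with its own current directory, so a bare file name passed through OLE is not resolved against the folder the script runs in. Instead of hard-coding the location, the script can build the absolute path itself. A minimal sketch, where to_absolute is a made-up helper name and File::Spec is the core module:

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: turn a possibly relative file name into an
# absolute path, based on the script's current working directory.
sub to_absolute {
    my ($path) = @_;
    return File::Spec->rel2abs($path);
}
```

    With that in place, the two assignments could become $inf = to_absolute("ole-word-demo-3.doc") and $outf = to_absolute("ole-word-demo-3.txt"), assuming the script is started from the folder that holds the document.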
      Yes! It works now and "ole-word-demo-3.txt" gets created.
      However, the Word formatting characters also appear as unreadable
      stuff in the .txt file; I suppose they can be
      stripped off.
      For doing the reverse, I guess we need to
      replace "wdFormatText" with "wdFormatDocument" in this
      statement:
      $doc->SaveAs( { FileName => $outfile, FileFormat => wdFormatText });
      However I'm not familiar with the Word constants, so I'm
      not sure about the rest of the statements:(
      What are the other changes we need to make to this code?
      Thanks in advance.
        You might want to look at adding the following after your require statement:

        use Win32::OLE::Const 'Microsoft Word';

        And then, changing this line:
        $doc->SaveAs( { FileName => $outfile, FileFormat => $wdFormatText});

        To:

        $word->WordBasic->FileSaveAs("$outfile", wdFormatText);

        This will remove the extra characters in the text file. As far as re-saving the document and restoring the formatting, that is another question altogether. Have you read the documentation on OLE? There's helpful information at ActiveState; search the whole ActiveState site for OLE as well. While there's not a lot of easy-to-find info out there, a web search on "OLE" and "Perl" and "Word" will turn up some useful pieces for your puzzle. David Roth has a couple of excellent books out there for reference.
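        To tie the pieces together, here is a rough sketch of one subroutine that handles both directions. It is not the code from the earlier post: it uses the document's SaveAs method with the imported constants rather than WordBasic, the name convert_with_word is made up, and it assumes Word is installed so that Win32::OLE::Const can load its type library. In the original script $wdFormatText was an undefined variable, so Word probably fell back to its default format, which would explain the unreadable characters.

```perl
use strict;
use warnings;
use File::Spec;
use Win32::OLE;
use Win32::OLE::Const 'Microsoft Word';    # imports wdFormatText, wdFormatDocument, ...

# Hypothetical helper: open $infile in Word and save it as $outfile
# in the given format (one of the wdFormat... constants).
sub convert_with_word {
    my ( $infile, $outfile, $format ) = @_;

    # Pass absolute paths, since Word does not share the script's directory.
    $infile  = File::Spec->rel2abs($infile);
    $outfile = File::Spec->rel2abs($outfile);

    # The second argument makes sure Word quits when $word goes out of scope.
    my $word = Win32::OLE->new( 'Word.Application', sub { $_[0]->Quit } )
        or die "Can't start Word: " . Win32::OLE->LastError;
    $word->{Visible} = 0;

    my $doc = $word->Documents->Open($infile)
        or die "Can't open $infile: " . Win32::OLE->LastError;

    $doc->SaveAs( { FileName => $outfile, FileFormat => $format } );
    $doc->Close;
    return 1;
}
```

        A call such as convert_with_word("ole-word-demo-3.doc", "ole-word-demo-3.txt", wdFormatText) would write the text version, and swapping the file names and passing wdFormatDocument would write a .doc again. The original formatting is not restored by that round trip, since the text file never contained it.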

        Also, have you tried resaving your converted file back into Word yet and has that worked? What's the end goal for this snippet?