in reply to To Read and Edit docx files in Windows 7

There is no need for a module, as you can use the OLE interface to interact directly with Word. Any module that wraps (or replaces) this functionality would have to be updated for each new version of Office that comes out.

This code should get you started on the right track. It is written for Word 2013, as that is what I have, but you should be able to change the object library version number to get it to work with your version of Word.

use 5.16.2;
use Win32::OLE;
use Win32::OLE::Enum;
use Win32::OLE::Const 'Microsoft Office 15.0 Object Library';

# Start Word; 'Quit' is called automatically when $word is destroyed.
my $word = Win32::OLE->new( 'Word.Application', 'Quit' );
my $doc = $word->Documents->Open( 'C:\Temp\OLE\Word\test.docx' )
    || die 'Unable to open document: ', Win32::OLE->LastError;

# Visit every word of every paragraph and replace the first "hi"/"Hi"
# found in it with "hello"/"Hello".
my $paragraphs = Win32::OLE::Enum->new( $doc->Paragraphs );
while ( defined( my $paragraph = $paragraphs->Next ) ) {
    my $words = Win32::OLE::Enum->new( $paragraph->{Range}->{Words} );
    # Named $word_range so it does not shadow the $word application object.
    while ( defined( my $word_range = $words->Next ) ) {
        $word_range->{Text} =~ s/([Hh])i/$1ello/;
    }
}

$doc->Save;
$doc->Close;

You may also find this helpful: http://msdn.microsoft.com/en-us/library/microsoft.office.interop.word%28v=office.14%29.aspx

Re^2: To Read and Edit docx files in Windows 7
by DVCHAL (Novice) on Dec 15, 2014 at 10:25 UTC
    Hi Simon! Thanks for the reply. My code works, but it throws the error shown below:

    Win32::OLE(0.1709) error 0x80010108: "The object invoked has disconnected from its clients"
        in METHOD/PROPERTYGET "Quit" at Author_doc_read_new.pl line 0
        eval {...} called at Author_doc_read_new.pl line 0
        eval {...} called at Author_doc_read_new.pl line 0

    What may be the cause of it? Below is my code snippet:
    use Win32::OLE;
    use Win32::OLE::Enum;
    use Win32::OLE::Const 'Microsoft Office 15.0 Object Library';
    use Win32::OLE::Const 'Microsoft Word';

    #$tm = localtime;
    #print "$tm\n";

    #Create and Open the Text file to Write
    open(OUTFILE2,">Author_name_extract.txt") or die("Cant open Output file\n");

    ### open Word application and add an empty document
    ### (will die if Word not installed on your machine)
    my $word = Win32::OLE->new('Word.Application', 'Quit') or die;
    $word->{Visible} = 0;

    @filesnames = glob '*.docx';
    #@filesnames = "AR765_Maint_Code_repositoryUINT32.docx";

    foreach $count (@filesnames) #Loop till the end is reached
    {
        if($count !~ /^~\$/)
        {
            print "$count\n";
            $filename = "D:\\MRJ_BCU\\Perl\\From thejaswini\\doc_read\\$count";
            #my $document = $word->Documents->open($filename) || die 'Unable to open document: ', Win32::OLE->LastError;
            my $document = $word->Documents->open($filename)|| die 'Unable to open document:';
            open(OUTFILE1,">File_under_Review.txt") or die("Cant open Output file\n");
            print "Extracting Text from $filename...\n";
            $paragraphs = $document->Paragraphs();
            $enumerate = new Win32::OLE::Enum($paragraphs);
            while(defined($paragraph = $enumerate->Next()))
            {
                $a = $paragraph->{Range}->{Text};
                print OUTFILE1 "$a\n";
            }
            close(OUTFILE1);
            $document->Save;
            $document->Close;

            # Open the Converted Text file to read the Pattern.
            open(INFILE,"<File_under_Review.txt") or die("Can't open file specified\n");
            while($a = <INFILE>)
            {
                if($a !~ /\S/)
                {
                    ;
                }
                else
                {
                    $b = $a;
                    if($a =~ /Date:/)
                    {
                        $a =~ /\s*\S*\s*Date:\s*(\d*\/\d*\/\d*)\s*/;
                        $a= $1;
                        $a =~ s/\s*//g;
                        $a =~ s/_*//g;
                        print OUTFILE2 "$count\t";
                        print OUTFILE2 "$a\t";
                    }
                    if($b =~ /Review Moderator:/)
                    {
                        $b =~ /\s*\S*\s*Review Moderator:\s*(\w+\s?.?\w*)\s*Date:/;
                        $b=$1;
                        $b =~ s/\s{2,}//g;
                        $b =~ s/_*//g;
                        print OUTFILE2 "$b\n";
                    }
                }
            }
            close(INFILE);

            #To Delete the Temp converted text File
            unlink("File_under_Review.txt");
        }
        else
        {
            print "Corrupted File: $count\n";
        }
    }

    #To Quit the Word Application
    $word->Quit();

    #close the Output text file used to write
    close(OUTFILE2);
    $tm = localtime;
    print "$tm\n";

      First, I highly recommend adding use strict; and use warnings;, or a use VERSION; line (replacing VERSION with your Perl version number; 5.12 or later turns strict on automatically), especially when using Win32::OLE.

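      Next, I strongly suspect the explicit $word->Quit(); is not necessary. You passed 'Quit' as the second argument to Win32::OLE->new, so Quit is already called for you when $word is destroyed. Calling it yourself as well means the automatic call at the end of the script reaches an object that has already gone away, which is (I think) the most likely source of the error you are experiencing.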
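
      As a rough illustration of that point, here is a minimal sketch that relies on the destructor alone. It assumes Word is installed and leaves out the document handling; the comments describe how Win32::OLE->new treats its optional second argument.

      ```perl
      use strict;
      use warnings;
      use Win32::OLE;

      # The second argument to new() is a destructor: the named method is
      # invoked automatically when the object is destroyed.
      my $word = Win32::OLE->new( 'Word.Application', 'Quit' )
          or die 'Unable to start Word: ', Win32::OLE->LastError;
      $word->{Visible} = 0;

      # ... open, process, and close documents here ...

      # No explicit $word->Quit() is needed. Dropping the last reference
      # runs the destructor, which calls Quit exactly once.
      undef $word;
      ```

      The alternative is to omit the 'Quit' argument and keep the explicit $word->Quit(); call; the point is to use one or the other, not both.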

        When I add use strict; or use VERSION;, I get the following error:

        Global symbol "@filesnames" requires explicit package name at Author_doc_read_new.pl line 31.
        BEGIN not safe after errors--compilation aborted at Author_doc_read_new.pl line 31.

        This error points at the line @filesnames = glob '*.docx';. How do I get rid of it?
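
        For context, strict reports this error for any variable that has not been declared, and the usual remedy is to declare each variable with my where it is first used. The sketch below shows that for the file loop only; is_word_lock_file is a hypothetical helper introduced here for illustration, not part of the original script.

        ```perl
        use strict;
        use warnings;

        # Hypothetical helper: true for Word's temporary "~$..." lock files.
        sub is_word_lock_file {
            my ($name) = @_;
            return $name =~ /^~\$/ ? 1 : 0;
        }

        # Declaring the array with 'my' satisfies strict.
        my @filesnames = glob '*.docx';

        # The loop variable is declared with 'my' as well.
        foreach my $count (@filesnames) {
            next if is_word_lock_file($count);
            print "$count\n";
        }
        ```

        The other variables in the script ($filename, $paragraphs, $enumerate, $paragraph, $tm) would need the same treatment, and strict will name each one in turn until all are declared.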