Here's how to get the words of a Word document out and into an array:

    #!/usr/bin/perl

    # general use directives
    use strict;
    use warnings;

    # project specific use directives
    # this comes with the standard ActiveState
    # distribution. You can also look for
    # a newer version with PPM
    use Win32::OLE;

    # get the document
    # use the full path
    # GetObject returns undef on failure rather than dying,
    # so check the return value instead of $@
    my $wd = Win32::OLE->GetObject('C:/pathto/document/foo.doc')
        or die "Unable to load document: ", Win32::OLE->LastError(), "\n";

    # all of the Word document data members I'm using
    # are explained in the MSDN documentation of the
    # external interfaces of a Word Document.
    # if you have MSDN, search for "Word OLE".

    # get the number of paragraphs
    my $paraCount = $wd->{Paragraphs}->Count;

    # set the counter (the Paragraphs collection is 1-based)
    my $foo = 0;
    my @words;

    while ($foo++ < $paraCount)
    {
        # split ' ' splits on runs of whitespace, so the trailing
        # paragraph mark and double spaces don't leave empty strings
        push @words, split ' ', $wd->{Paragraphs}->Item($foo)->{Range}{Text};
    }

    # clean up at the end
    undef $wd;

You may prefer a different data structure, but again, I'll leave that up to you! I hope this helps.
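As one example of a different data structure, the word list could be folded into a hash of counts instead of a flat array. This is only a minimal sketch: it assumes the paragraph text has already been pulled out of Word, and both the @paragraphs sample strings and the count_words name are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# hypothetical stand-ins for the Text of each paragraph;
# Word ends each paragraph's text with a carriage return
my @paragraphs = ("The quick brown fox\r", "jumps over  the lazy dog\r");

# count_words is an illustrative helper, not part of Win32::OLE
sub count_words {
    my %count;
    for my $text (@_) {
        # split ' ' skips leading whitespace and splits on runs of it,
        # so no empty strings end up in the list
        $count{lc $_}++ for split ' ', $text;
    }
    return \%count;
}

my $count = count_words(@paragraphs);
print "$_: $count->{$_}\n" for sort keys %$count;
```

In the real script you would push each paragraph's Text onto @paragraphs inside the while loop and call count_words once at the end.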
In reply to Re: (Get text of Word Document) going through a Win32 MSWORD doc
by buzzcutbuddha
in thread going through a Win32 MSWORD doc
by Clancy