in reply to Snarfing data from microsoft word

Greetings,

On Win32, you may want to consider:

use Win32::OLE;
my $word=Win32::OLE->new('word.application');
my $doc=$word->Documents->Open('C:\foo\bar.doc');
my $text=$doc->{Text};
# $text=~s/\r/\n/g;
OK, so this yields all the text of the document. If the text is inside text boxes (for instance), it will not be in $text.
But $doc contains the entire DOM, so you can get at every part of the document, and churn it to your heart's content... so to speak. Your mileage will vary according to the complexity and variability of the structure of your document, etc.
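To illustrate reaching into the object model for text boxes, here is a rough sketch, not a tested recipe. It assumes Word is installed, that the text boxes are floating shapes exposed through the document's Shapes collection, and that the path from the post above exists. The @boxes array is just an illustrative name, and shapes that cannot hold text are skipped.

```perl
use strict;
use warnings;
use Win32::OLE qw(in);

# Assumption: Word is installed; 'Quit' asks Word to exit when $word goes away.
my $word = Win32::OLE->new('Word.Application', 'Quit')
    or die "Cannot start Word: ", Win32::OLE->LastError;

# Assumption: same example path as in the post above.
my $doc = $word->Documents->Open('C:\foo\bar.doc')
    or die "Cannot open document: ", Win32::OLE->LastError;

my @boxes;    # hypothetical name: collected text box contents
for my $shape (in($doc->Shapes)) {
    # Not every shape has a usable text frame (pictures, for instance).
    my $frame = $shape->TextFrame or next;
    next unless $frame->HasText;

    my $text = $frame->TextRange->Text;
    $text =~ s/\r/\n/g;    # Word ends paragraphs with \r
    push @boxes, $text;
}

print "$_\n" for @boxes;

$doc->Close(0);    # 0 = close without saving changes
```

Text boxes that live in headers, footers, or grouped shapes would probably need extra digging through the corresponding collections, so treat this as a starting point.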
perl -MWin32::OLE -d -e 42
will be your friend...
Cheers,
alf
You can't have everything: where would you put it?

Re: Re: Snarfing data from microsoft word
by McD (Chaplain) on Sep 18, 2002 at 17:36 UTC
    I had to use:

    my $text=$doc->{Content}->{Text};
    ...but after that it works like a charm, and just happens to be exactly the code I need today!

    Peace,
    -McD