You could use Win32::OLE to make Word tell you. I did something like this to list all properties of all documents in my documents tree. Performance isn't too good though: about 3 documents per second on my sturdy old P233 with Win2k. Here's my code anyway. Hope it helps.
#!/usr/bin/perl -w

# Uses
use strict;
use Win32::OLE qw(valof);   # valof is only exported on request
use Win32::OLE::Variant;
use Win32::OLE::Const;
use File::Find;

# We want to handle collections
Win32::OLE->Option(_NewEnum => 1);

# Variables
use vars qw($MSWord $wd $startdir);

# Where to start the doc search
$startdir='d:/documents';

# Create new MSWord object and load constants
$MSWord=Win32::OLE->new('Word.Application','Quit')
    or die "Could not load MS Word";
$wd=Win32::OLE::Const->Load($MSWord);

# Find documents
find(\&getProps,$startdir);

######################################
sub getProps { # Find sub
    # We only want .doc files (anchored, case-insensitive)
    return unless /\.doc$/i && -f;

    # No OLE warnings please
    local $Win32::OLE::Warn = 0;

    # Open document
    my $doc = $MSWord->Documents->Open({FileName=>$File::Find::name});

    # Exit nicely if we couldn't open doc
    return unless $doc;

    # Print header
    print "\n-----------------------------------------------\n";
    print "Document: $File::Find::name\n";
    print "-----------------------------------------------\n";

    # List document properties
    foreach my $prop (@{$doc->BuiltInDocumentProperties->{_NewEnum}},
                      @{$doc->CustomDocumentProperties->{_NewEnum}}) {
        if (defined $prop->{Name}) {
            print $prop->{Name};
            my $value = $prop->{Value};
            if (defined $value) {
                # Variants: test the class of the reference, not its string value
                if (ref($value) =~ /^Win32::OLE::Variant/) {
                    print ": ".valof($value);
                } else {
                    print ": ".$value;
                }
            } else {
                print ": undefined";
            }
            print "\n";
        }
    }

    # Close document
    $doc->Close({SaveChanges=>$wd->{wdDoNotSaveChanges}});
}


/brother t0mas
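
When only one or two properties are needed, they can probably be looked up by name instead of enumerating every property. What follows is a minimal sketch, not part of the original post. It assumes a single file path given on the command line (the script name docprops.pl is hypothetical), and it assumes 'Title' and 'Author' are the built-in property names of interest. It has not been timed, so whether it helps with the speed mentioned above is an open question.

```perl
#!/usr/bin/perl -w
use strict;
use Win32::OLE;

# Hypothetical usage: perl docprops.pl d:/documents/report.doc
my $file = shift or die "Usage: $0 file.doc\n";

my $word = Win32::OLE->new('Word.Application', 'Quit')
    or die "Could not load MS Word";

# Return undef on OLE errors instead of warning
local $Win32::OLE::Warn = 0;

# Open read-only, since we only inspect the document
my $doc = $word->Documents->Open({FileName => $file, ReadOnly => 1})
    or die "Could not open $file: " . Win32::OLE->LastError() . "\n";

# Look up each property by name (names here are examples)
foreach my $name ('Title', 'Author') {
    my $prop  = $doc->BuiltInDocumentProperties($name);
    my $value = $prop ? $prop->{Value} : undef;
    print "$name: ", (defined $value ? $value : 'undefined'), "\n";
}

# 0 is the numeric value of wdDoNotSaveChanges
$doc->Close({SaveChanges => 0});
```

Like the code above, this needs Windows with Word installed, because Win32::OLE drives the real application.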

In reply to Re: Parsing an MS Word document by t0mas
in thread Parsing an MS Word document by Anonymous Monk
