in reply to Accessing Meta data from MS WORD
/me nods...
IIRC, a .docx file is a zip archive of XML files that follow a well-known public schema (Office Open XML). If you cannot find a CPAN module that already does what you want, one approach is to write code that unzips the file and then pulls what you need out of the XML with XPath expressions, which saves you from writing code that walks the XML's internal structure by hand. But it is extremely likely that what you are doing is “a thing already done.”
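
For what it's worth, here is a rough sketch of that unzip-then-XPath idea. It is written in Python only because the standard library covers both steps; the same two steps should carry over to Perl with a zip module and an XPath-capable XML parser. It assumes the metadata you want is in the `docProps/core.xml` part of the archive, which is where Word normally stores title, author, and timestamps. The names `read_core_properties` and `FIELDS` are made up for illustration, and the choice of fields is arbitrary.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace prefixes used inside docProps/core.xml (Office Open XML core properties).
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Hypothetical selection of fields: result key -> XPath relative to the root element.
FIELDS = {
    "title": "dc:title",
    "creator": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
}


def read_core_properties(docx):
    """Return a dict of core metadata from a .docx path or file-like object.

    Fields absent from the document come back as None. A KeyError is raised
    if the archive has no docProps/core.xml part at all.
    """
    # Step 1: the .docx is a zip archive, so read the metadata part out of it.
    with zipfile.ZipFile(docx) as archive:
        xml_bytes = archive.read("docProps/core.xml")

    # Step 2: query the XML with (namespace-aware) XPath-style expressions
    # instead of walking the tree by hand.
    root = ET.fromstring(xml_bytes)
    return {
        name: root.findtext(path, default=None, namespaces=NS)
        for name, path in FIELDS.items()
    }
```

Calling `read_core_properties("report.docx")` (file name hypothetical) returns a plain dict such as `{"title": ..., "creator": ..., ...}`. Note that ElementTree only supports a subset of XPath; anything fancier would need a full XPath engine.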