captain.moor has asked for the wisdom of the Perl Monks concerning the following question:

Hi there, I have around 50,000 Word documents and I would like to write a script to sort these files by author. The author is held as an extended file attribute, and so far I have not been able to find a way to access this data. I have tried using the following Win32::OLE script to extract these values. However, under Windows 7 this does not return the correct information. I am unsure whether this is because Windows 7 has changed the file attribute structure, or because this method does not allow me access to this data.
use File::Find;
use Win32::OLE;

# Note: $pname and extention() are not shown here; presumably defined elsewhere in the script.
my $directory = $ARGV[0] || die "$pname requires directory as first argument\n";
find(\&main2, $directory);

sub main2 {
    if (-f $File::Find::name) {
        if (extention($File::Find::name) eq '.pdf') {
            #print "File: $_ Extention: " . extention($_) . "\n";
            my $shell  = Win32::OLE->new("Shell.Application") or die;
            my $folder = $shell->NameSpace("$File::Find::dir") or die;
            my $file   = $folder->ParseName("$_") or die;
            # Dump every non-empty detail column for this file.
            for my $i (0..50) {
                my $attrib = $folder->GetDetailsOf($file, $i);
                print "$i) $_ >> $attrib\n" if $attrib ne '';
            }
            print "-------------------------------------------\n";
        }
    }
}
I want to be able to access this value on PDF, DOC and Excel file types, so if you could point me in the direction of a module or thread on how I can achieve this I would be forever grateful. Thanks and looking forward to your response. Tim

Re: Word Document Accessing Extended Attributes - Author
by keszler (Priest) on Oct 31, 2009 at 12:39 UTC
    It appears to be a change in Windows 7. The following worked on XP64.
    G:\>type x.pl
    use Win32::OLE;
    my $shell = Win32::OLE->new("Shell.Application") or die;
    my $folder = $shell->NameSpace("G:\\") or die;
    my $file = $folder->ParseName("X.doc") or die;
    for my $i (0..50){
        my $attrib = $folder->GetDetailsOf($file, $i);
        print "$i) $_ >> $attrib\n" if ! $attrib eq '';
    }
    print "-------------------------------------------\n";

    G:\>x.pl
    0) >> X.doc
    1) >> 267 KB
    2) >> Microsoft Word Document
    3) >> 10/31/2009 8:13 AM
    4) >> 10/31/2009 8:12 AM
    5) >> 10/31/2009 12:00 AM
    6) >> A
    7) >> Online
    8) >> Everyone
    9) >> Scott R. Keszler
    10) >> Test Document
    13) >> 8
    31) >> 8/3/2009 1:13 PM
    -------------------------------------------
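If the column positions differ between Windows versions (the author shows up at index 9 in the XP64 output above), one option is to look the column up by its header text instead of hard-coding an index. The following is only a sketch under several assumptions: that passing the folder's Items collection to GetDetailsOf returns the column header (a common idiom in shell scripting, not verified here on Windows 7), that the header is called "Authors" or "Author" on an English system, and that 300 columns is a large enough range to scan. The helper find_column is a hypothetical name introduced for illustration.

```perl
use strict;
use warnings;

# Return the index of the first column whose header matches $wanted
# (case-insensitive), or undef if none is found in 0..$max.
# $get_header is a code ref that maps an index to a header string.
sub find_column {
    my ($get_header, $wanted, $max) = @_;
    for my $i (0 .. $max) {
        my $name = $get_header->($i);
        next unless defined $name;
        return $i if lc $name eq lc $wanted;
    }
    return;
}

# The Shell.Application part only runs on Windows, and only when called as:
#   perl script.pl <absolute directory> <file name>
if ($^O eq 'MSWin32' && @ARGV == 2) {
    require Win32::OLE;
    my ($dir, $name) = @ARGV;

    my $shell  = Win32::OLE->new('Shell.Application') or die "Cannot create Shell.Application\n";
    my $folder = $shell->NameSpace($dir)              or die "Cannot open folder $dir\n";
    my $file   = $folder->ParseName($name)            or die "Cannot find $name in $dir\n";

    # Assumption: GetDetailsOf returns the header text when given the
    # Items collection rather than a single item.
    my $items  = $folder->Items;
    my $lookup = sub { $folder->GetDetailsOf($items, $_[0]) };

    # Assumption: English header names; localized systems will differ.
    my $col;
    for my $header ('Authors', 'Author') {
        $col = find_column($lookup, $header, 300);
        last if defined $col;
    }
    die "No author column found\n" unless defined $col;

    print "$name: ", $folder->GetDetailsOf($file, $col), "\n";
}
```

Keeping the lookup in a small function that takes a code ref means the search logic can be checked without a Windows machine, and the header names can be changed in one place.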
    OLE::Storage might work on Win7:
    G:\>ppm install ole-storage
    G:\>ppm install unicode-map
    G:\>ppm install startup

    G:\>which ldat
    C:\Perl64\site\bin/ldat

    G:\>ldat X.doc
    Processing "X.doc"
    # Microsoft Office Word Document (Word.Document.8, 31.10.2009, 12:13:05, rev 10 )
    Title: Test Document
    Authress: Scott R. Keszler
    Organization: SRK Consulting
    Application: Microsoft Office Word
    Template: Normal.dot
    Created: 03.08.2009, 17:13:00
    Last saved: 31.10.2009, 12:13:00
    Done.
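To sort a large set of files by author with this approach, the ldat output could be captured and parsed. The sketch below assumes the output always has the "Key: value" layout shown above and that ldat is on the PATH; parse_ldat is a hypothetical helper, not part of OLE::Storage. The key "Authress" is taken literally from the output above, with "Author" tried as a fallback.

```perl
use strict;
use warnings;

# Turn ldat-style "Key: value" lines into a hash reference.
# Lines without a colon ("Processing ...", "Done.") and the
# "# ..." summary line are ignored.
sub parse_ldat {
    my ($text) = @_;
    my %prop;
    for my $line (split /\r?\n/, $text) {
        next if $line =~ /^\s*#/;
        next unless $line =~ /^\s*([^:]+?):\s*(.*?)\s*$/;
        $prop{$1} = $2;
    }
    return \%prop;
}

# Usage: perl script.pl file1.doc file2.doc ...
if (@ARGV) {
    my %by_author;
    for my $path (@ARGV) {
        # Assumption: ldat is installed and prints the format shown above.
        my $out = `ldat "$path"`;
        next unless defined $out;
        my $prop   = parse_ldat($out);
        my $author = defined $prop->{Authress} ? $prop->{Authress}
                   : defined $prop->{Author}   ? $prop->{Author}
                   :                             '(unknown)';
        push @{ $by_author{$author} }, $path;
    }
    # Print the files grouped under each author.
    for my $author (sort keys %by_author) {
        print "$author\n";
        print "    $_\n" for @{ $by_author{$author} };
    }
}
```

Starting one ldat process per file may be slow for 50,000 documents, so this is meant as a starting point rather than a finished tool. Since OLE::Storage reads OLE compound documents, it would likely cover .doc and .xls files but not PDF.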