As promised, here is the code to read the text contained in a PowerPoint presentation. It's approximately the same as the code I submitted in an earlier node. I have also included the function call, so that you can get an idea of how to use the function.
Of course, it's easy to access many more properties and to deal with other objects. It shouldn't be too hard to access the PowerPoint slides' notes...
    package readPowerPoint;

    use strict;
    use Win32::OLE qw(in with);
    use Win32::OLE::Const 'Microsoft PowerPoint';
    use Win32::OLE::Const 'Microsoft Office';
    use Data::Dumper;
    require Exporter;
    use vars qw(@ISA @EXPORT);
    @ISA    = qw(Exporter);
    @EXPORT = qw(readActivePPT);

    # This function returns the name of the active PPT, its built-in document
    # properties and a hash holding its text content. Hence it is ready for use.
    # It returns an empty list if PowerPoint is not running or if there is
    # no active presentation.
    sub readActivePPT
    {
        my $powerpoint = Win32::OLE->GetActiveObject('PowerPoint.Application')
            or return;
        my $activePPT = $powerpoint->ActivePresentation;
        return unless defined $activePPT;

        my $PPTName       = $activePPT->Name;    # retrieve PPT name i.e. subject
        my $PPTProperties = $activePPT->{BuiltInDocumentProperties};
        my %PPTContent    = ();    # read PPT content - see below for structure details
        my $slideCount    = $activePPT->{Slides}->Count;

        for my $slideNumber (1..$slideCount)
        {
            my $total = $slideNumber / $slideCount;
            print "\n---------------- Please wait (".int(100*$total)."% done)----------------";
            my $slide = $activePPT->Slides($slideNumber);
            for my $shapeNumber (1..$slide->{Shapes}->Count)
            {
                my $shape   = $slide->Shapes($shapeNumber);
                my $hasText = $shape->HasTextFrame();
                next unless ($hasText == msoCTrue) or ($hasText == msoTrue);

                my $wordGroup = $shape->TextFrame->{TextRange};
                next unless defined $wordGroup;

                my $text = "";
                foreach my $palabre (in $wordGroup->Words)    # collects every single word
                {
                    $text .= lc($palabre->{Text})." ";    # lc stands for lower case
                }
                $PPTContent{("slide".$slideNumber)}{("shape".$shapeNumber)} = $text;
            }
        }
        return ($PPTName, $PPTProperties, %PPTContent);
    }

    # structure of %PPTContent will be a hash of hashes :
    # first hash will contain all the slides and each slide will be a hash itself
    # containing all the text areas found.
    # to access do this $PPTContent{slide1}{shape1}

    1;    # a module must return a true value
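One thing to keep in mind with this hash of hashes: Perl hashes have no fixed order, and a plain string sort would put slide10 before slide2. Here is a minimal sketch of walking the structure in slide order. The helper byTrailingNumber and the sample data are made up for illustration; they are not part of the module.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the module above): orders keys such as
# "slide10" or "shape2" by the number at the end of the key.
sub byTrailingNumber {
    my @keys = @_;
    return map  { $_->[1] }
           sort { $a->[0] <=> $b->[0] }
           map  { my ($n) = /(\d+)$/; [ defined $n ? $n : 0, $_ ] } @keys;
}

# Made-up sample data with the same shape as %PPTContent
my %PPTContent = (
    slide1  => { shape1 => "project status ", shape2 => "agenda " },
    slide2  => { shape1 => "next steps " },
    slide10 => { shape1 => "questions " },
);

for my $slide (byTrailingNumber(keys %PPTContent)) {
    for my $shape (byTrailingNumber(keys %{ $PPTContent{$slide} })) {
        print "$slide / $shape: $PPTContent{$slide}{$shape}\n";
    }
}
```

Keys without a trailing number sort first here, which should never happen with the keys readActivePPT builds.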
And here is the bit of code used to call readActivePPT:
    use strict;
    use Win32::OLE qw(in);    # needed for "in" below
    use readPowerPoint;

    my ($PPTTitle, $PPTProperties, %PPTContent) = readPowerPoint::readActivePPT();
    # "defined %hash" is not allowed in recent Perls, so only the scalars are tested
    if ( (defined $PPTTitle) and (defined $PPTProperties) )
    {
        foreach my $prop (in $PPTProperties)    # prints out every single property
        {
            print "\n".$prop->Name.":".$prop->Value;
        }
        my $text = "";
        foreach my $slide (keys %PPTContent)
        {
            $text .= "\n$slide : ";
            foreach my $shape (keys %{$PPTContent{$slide}})
            {
                $text .= "\n   $shape content = $PPTContent{$slide}{$shape}";
            }
        }
        print $text;    # the collected text was never printed before
    }
    else
    {
        print ("PowerPoint is not running, or there doesn't seem to be an active PPT presentation open.");
    }
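As for the slides' notes mentioned at the top, here is a rough, untested sketch of how they might be read. The function name readSlideNotes is made up, and it assumes that the second placeholder on a notes page is the notes body, which is the usual layout but is not guaranteed for every presentation.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the module above.
# $presentation is expected to be a Win32::OLE Presentation object,
# e.g. what $powerpoint->ActivePresentation returns.
sub readSlideNotes {
    my ($presentation) = @_;
    my %notes;
    for my $slideNumber (1 .. $presentation->Slides->Count) {
        my $placeholders =
            $presentation->Slides($slideNumber)->NotesPage->Shapes->Placeholders;

        # Assumption: placeholder 2 of a notes page holds the notes text
        next if $placeholders->Count < 2;
        my $shape = $placeholders->Item(2);
        next unless $shape->HasTextFrame;

        $notes{"slide$slideNumber"} = $shape->TextFrame->TextRange->Text;
    }
    return %notes;    # keys follow the same "slideN" naming as %PPTContent
}
```

Something like my %notes = readSlideNotes($powerpoint->ActivePresentation); should then give one entry per slide that has a notes placeholder.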