Using OLE to view given Paragraph in MS Word Document

by Ray Smith (Beadle)
on Nov 21, 2011 at 18:49 UTC

Ray Smith has asked for the wisdom of the Perl Monks concerning the following question:

I currently successfully parse MS Word Documents, extracting paragraph style and text. However, I've been unsuccessful in displaying a Word Document, given a specified paragraph number.

My sample program demonstrates the problem. I can read through the Word document using $enumerate->Next() (I'd rather position directly, but Skip() doesn't seem to work). Although it appears that I reach the desired paragraph, the Word window does not move to show it.

I see that Selection may be what I want, but I can't figure out how to make it work. I lack the VBA documentation, and I have not managed to translate the VBA samples I have seen into Perl / OLE calls.

Thanks for your attention.

#!/usr/bin/perl -w
# Simple case to open MS Word Document and view Nth paragraph
use strict;
use warnings;
use Win32::OLE;
use Win32::OLE::Enum;
use Cwd qw(getcwd abs_path);

my $ParaNo = 10;                        # Default target paragraph
my $InFile = shift if @ARGV > 0;        # Required file name
my $app_name = "Word.Application.8";    # Word's application name
my $app;
eval {$app = Win32::OLE->GetActiveObject($app_name)};    # Use instance if already running
die "Word ($app_name) is not installed" if $@;
if (!defined($app)) {
    $app = Win32::OLE->new($app_name, sub {$_[0]->Quit;})
        || die "Could not connect to $app_name $!";
}
$app->{'Visible'} = 1;
my $abspath = abs_path($InFile);        # Word appears to need absolute path
my $doc = $app->Documents()->Open({
    FileName => $abspath,
    ReadOnly => 0,
});
die "Can't open doc $abspath: $!" if !defined($doc);
my $paragraphs = $doc->Paragraphs();
my $enumerate = new Win32::OLE::Enum($paragraphs);
if (!defined($enumerate)) {
    die "Can't get enumerate for $InFile";
}
my $paragraph;
for (my $i = 0; $i<$ParaNo; $i++) {
    $paragraph = $enumerate->Next();
}
my $style = $paragraph->{Style}->{NameLocal};
my $text = $paragraph->{Range}->{Text};
print "style=$style text=$text\n";
print "Why doesn't the view show this location?\n";
print "ENTER to quit\n";
my $ans = <>;
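On the Selection question: a VBA sample such as `Selection.HomeKey Unit:=wdStory` followed by `Selection.MoveDown Unit:=wdParagraph, Count:=n` can usually be carried over to Win32::OLE as plain method calls on `$app->Selection`. The sketch below is one possible translation, not a tested recipe. The helper name `goto_paragraph` is made up for illustration, and the numeric values are assumed to match Word's WdUnits enumeration (`use Win32::OLE::Const 'Microsoft Word';` should provide them by name instead).

```perl
use strict;
use warnings;

# Assumed values from Word's WdUnits enumeration.
use constant { WD_PARAGRAPH => 4, WD_STORY => 6 };

# Hypothetical helper: move the insertion point to the start of
# paragraph $n (1-based). $app is expected to be a Word Application
# object that has an active document.
sub goto_paragraph {
    my ($app, $n) = @_;
    die "Paragraph number must be 1 or greater\n" if $n < 1;

    my $sel = $app->Selection;
    $sel->HomeKey(WD_STORY);                          # jump to start of document
    $sel->MoveDown(WD_PARAGRAPH, $n - 1) if $n > 1;   # step down n-1 paragraphs
    return $sel;
}

# Only runs on Windows when a paragraph number is given on the command line.
if ($^O eq 'MSWin32' && @ARGV) {
    require Win32::OLE;
    my $app = Win32::OLE->GetActiveObject('Word.Application')
        or die "Word is not running\n";
    goto_paragraph($app, $ARGV[0]);
}
```

Because the cursor itself moves, the Word window should follow it. How paragraphs inside tables are counted has not been checked here.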

Replies are listed 'Best First'.
Re: Using OLE to view given Paragraph in MS Word Document
by ricDeez (Scribe) on Nov 21, 2011 at 21:37 UTC

    I managed to get this to work with the changes shown below:

    #!/usr/bin/perl -w
    # Simple case to open MS Word Document and view Nth paragraph
    use strict;
    use warnings;
    use 5.012;
    use Win32::OLE;
    use Win32::OLE::Enum;
    use Cwd qw(getcwd abs_path);

    my $ParaNo = 10;                        # Default target paragraph
    # my $InFile = shift if @ARGV > 0;      # Required file name
    #####################################################################
    # For the purposes of testing, I hard-coded the file name and path
    #####################################################################
    my $InFile = "C:/Users/Ric/Desktop/Report-WirelessSurvey.doc";
    #####################################################################
    # The following makes the code less portable, requiring $app_name to
    # be modified accordingly!
    #####################################################################
    # my $app_name = "Word.Application.8";  # Word's application name
    # my $app;
    #####################################################################
    # This approach will use the active instance or will open word if
    # required
    #####################################################################
    my $doc = Win32::OLE->GetObject ( $InFile )
        or die "Could not load $InFile. \n";
    my $app = $doc->{Application};
    $app->{Visible} = 1;
    #####################################################################
    # This is a good idea
    #####################################################################
    $app->{DisplayAlerts} = 0;
    # eval {$app = Win32::OLE->GetActiveObject($app_name)}; # Use instance if already running
    # die "Word ($app_name) is not installed" if $@;
    # if (!defined($app)) {
    #     $app = Win32::OLE->new($app_name, sub {$_[0]->Quit;})
    #         || die "Could not connect to $app_name $!";
    # }
    # $app->{'Visible'} = 1;
    # my $abspath = abs_path($InFile);      # Word appears to need absolute path
    # my $doc = $app->Documents()->Open({
    #     FileName => $abspath,
    #     ReadOnly => 0,
    # });
    # die "Can't open doc $abspath: $!" if !defined($doc);
    #####################################################################
    # Why are you using Win32::OLE::Enum?
    #####################################################################
    my $paragraphs = $doc->Paragraphs();
    # my $enumerate = new Win32::OLE::Enum($paragraphs);
    # if (!defined($enumerate)) {
    #     die "Can't get enumerate for $InFile";
    # }
    my $paragraph;
    # for (my $i = 0; $i<$ParaNo; $i++) {
    for my $i ( 1 .. $paragraphs->Count()){
        last if $i > $ParaNo;    #Forgot that you wanted to stop here!
        $paragraph = $paragraphs->Item( $i );
        ##################################################################
        # This bit needs to be in the loop!
        ##################################################################
        my $style = $paragraph->{Style}->{NameLocal};
        my $text = $paragraph->{Range}->{Text};
        print "style=$style text=$text\n";
        # print "Why doesn't the view show this location?\n";
        # print "ENTER to quit\n";
        # my $ans = <>;
        # $paragraph = $enumerate->Next();
    }

    Try these changes and let me know how you go. This may still trip up on Unicode characters!
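Since `Paragraphs->Item($i)` takes an index, the loop is only needed when every paragraph up to the target should be printed. If only paragraph N matters, it can probably be fetched in one call. A minimal sketch, assuming `$doc` is a Word Document object obtained as in the reply above; the helper name `paragraph_info` is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: return (style name, text) of paragraph $n
# (1-based) without visiting the paragraphs before it.
sub paragraph_info {
    my ($doc, $n) = @_;
    my $paragraphs = $doc->Paragraphs;
    my $count      = $paragraphs->Count;
    die "Paragraph $n is outside 1..$count\n" if $n < 1 || $n > $count;

    my $paragraph = $paragraphs->Item($n);
    return ($paragraph->Style->NameLocal, $paragraph->Range->Text);
}

# Only runs on Windows when a file name is given on the command line;
# an optional second argument is the paragraph number (default 10).
if ($^O eq 'MSWin32' && @ARGV) {
    require Win32::OLE;
    require Cwd;
    my $doc = Win32::OLE->GetObject(Cwd::abs_path($ARGV[0]))
        or die "Could not load $ARGV[0]\n";
    my ($style, $text) = paragraph_info($doc, $ARGV[1] || 10);
    print "style=$style text=$text\n";
}
```

The range check is there because asking Word for an index past `Count` fails inside the OLE call, which is harder to diagnose than an explicit message.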

      Thanks for the example.

      I tried it with these changes:
      1. Using my own test file.
      2. Changing to use 5.10, because that's what I have.
      3. Calling abs_path on the input file, because Word appears to require an absolute path.

      Everything runs without error, but Word still leaves the cursor at the beginning of the file.

      Am I missing something here?

        I don't really understand what you want to do!

        If you need to view the paragraphs being selected you could add the following:

        for my $i ( 1 .. $paragraphs->Count()){
            last if $i > $ParaNo;
            $paragraph = $paragraphs->Item( $i );
            $paragraph->{Range}->Select();    # <<<<<Added
            sleep(1);                         # <<<<<Added
            my $style = $paragraph->{Style}->{NameLocal};
            my $text = $paragraph->{Range}->{Text};
            print "style=$style text=$text\n";
        }

        I placed the sleep in the loop so that you can see the paragraphs being selected in turn. Otherwise it would happen too quickly to follow, especially since you are only interested in the first 10 paragraphs!
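If the goal is only to land on the target paragraph, the loop and the sleep can likely be dropped by selecting that one paragraph's range directly. A sketch under the same assumptions as the code above (`$doc` is a Word Document object); `show_paragraph` is a made-up name, and the `Activate` call is there on the assumption that the document may not be the active one:

```perl
use strict;
use warnings;

# Hypothetical helper: make paragraph $n (1-based) the current selection
# so that the Word window shows it, and return its text.
sub show_paragraph {
    my ($doc, $n) = @_;
    my $paragraphs = $doc->Paragraphs;
    my $count      = $paragraphs->Count;
    die "Paragraph $n is outside 1..$count\n" if $n < 1 || $n > $count;

    $doc->Activate;                            # make this document the active one
    my $range = $paragraphs->Item($n)->Range;
    $range->Select;                            # selecting moves the view to the range
    return $range->Text;
}

# Only runs on Windows when a file name is given on the command line;
# an optional second argument is the paragraph number (default 10).
if ($^O eq 'MSWin32' && @ARGV) {
    require Win32::OLE;
    require Cwd;
    my $doc = Win32::OLE->GetObject(Cwd::abs_path($ARGV[0]))
        or die "Could not load $ARGV[0]\n";
    $doc->Application->{Visible} = 1;
    print show_paragraph($doc, $ARGV[1] || 10), "\n";
    print "ENTER to quit\n";
    my $ans = <STDIN>;
}
```

Waiting for ENTER keeps the script, and with it the OLE connection, alive long enough to look at the Word window.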
