PerlMonks  

How do I apply formatting to a particular word in a docx file using Win32::Ole in Perl?

by ZJ.Mike.2009 (Scribe)
on Sep 23, 2010 at 03:29 UTC (#861439=perlquestion)

ZJ.Mike.2009 has asked for the wisdom of the Perl Monks concerning the following question:

For example, my docx file contains the following sentences:

This is a Perl example
This is a Python example
This is another Perl example

I want to apply bold style to all the occurrences of the word "Perl" like so:

This is a Perl example
This is a Python example
This is another Perl example

I've so far come up with the following script:
use strict;
use warnings;
use Win32::OLE::Const 'Microsoft Word';

my $file = 'E:\test.docx';

my $Word = Win32::OLE->new('Word.Application', 'Quit');
$Word->{'Visible'} = 0;

my $doc = $Word->Documents->Open($file);
my $paragraphs = $doc->Paragraphs() ;
my $enumerate = new Win32::OLE::Enum($paragraphs);

while(defined(my $paragraph = $enumerate->Next())) {
    my $text = $paragraph->{Range}->{Text};
    my $sel = $Word->Selection;
    my $font = $sel->Font;
    if ($text =~ /Perl/){
        $font->{Bold} = 1;
    }
    $sel->TypeText($text);
}

$Word->ActiveDocument->Close ;
$Word->Quit;
But it has applied bold style to the whole paragraph and it does not edit the sentences in their original place. It gives me both the modified version and the original version like this:

This is a Perl example
This is a Python example
This is another Perl example
This is a Perl example
This is a Python example
This is another Perl example

How should I fix my problem? Any pointers? Thanks as always :)

(This question has been cross-posted at stackoverflow.)

Problem solved! Thanks to @NetWallah and @sflitman.

With help from Zaid and cjm at stackoverflow, I've finally solved the problem :) Here's the code that works nicely:

while ( defined (my $paragraph = $enumerate->Next()) ) {

    my $words = Win32::OLE::Enum->new( $paragraph->{Range}->{Words} );

    while ( defined ( my $word = $words->Next() ) ) {

        my $font = $word->{Font};
        $font->{Bold} = 1 if $word->{Text} =~ /Perl/;
    }
}
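
A note on the match itself, as a minimal sketch rather than part of the posted solution: each item in Word's Words collection normally carries its trailing space, and an unanchored /Perl/ also matches inside longer words. The strings below are made-up stand-ins for $word->{Text}, so the snippet runs without Word installed.

```perl
use strict;
use warnings;

# Made-up sample values standing in for $word->{Text};
# they are not taken from a real document.
my @word_texts = ('This ', 'is ', 'a ', 'Perl ', 'Perlish ', 'perl ', 'example');

# Unanchored, case-sensitive match, as in the loop above.
my @loose = grep { /Perl/ } @word_texts;

# \b anchors the match to a whole word; the trailing space does not interfere.
my @whole = grep { /\bPerl\b/ } @word_texts;

print "loose: ", join('|', @loose), "\n";
print "whole: ", join('|', @whole), "\n";
```

If only the standalone word should be bolded, swapping /Perl/ for /\bPerl\b/ in the loop is one option; whether case should matter is a separate choice.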

Replies are listed 'Best First'.
Re: How do I apply formatting to a particular word in a docx file using Win32::Ole in Perl?
by NetWallah (Canon) on Sep 23, 2010 at 04:56 UTC
    You need to use the Range.Words Property to scan the words in the $paragraph->{Range}.

    Something like this (Untested, uncompiled):

    for my $w ($paragraph->{Range}->Words()) {
        next unless $w->Text() =~ /perl/i;
        $w->Font->{Bold} = 1;
    }

         Syntactic sugar causes cancer of the semicolon.        --Alan Perlis

      @NetWallah, thanks for the guidance. But perl throws me an error suggesting that $w->Text returns an uninitialized value. I've used the following lines of code to check for the keys of the $w hashref:
      my @keys = keys %{$w};
      print @keys,"\n";
      And I get the following hash keys: Count First Last Application Creator Parent

      It seems I can only use something like $w->First->Text. But I'm stuck there :(
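
The keys listed above (Count, First, Last) suggest that $w is the Words collection itself, not a single word: a plain "for" over one object runs exactly once. The following is only a sketch of that behaviour. MockWords is a hypothetical stand-in that mimics the Count property and the 1-based Item method, so the code runs without Word; it does not call Win32::OLE.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Word's Words collection so the sketch runs
# without Word. It mimics only Count and the 1-based Item method.
{
    package MockWords;

    sub new {
        my ($class, @texts) = @_;
        my @items = map { +{ Text => $_, Font => { Bold => 0 } } } @texts;
        return bless { items => \@items }, $class;
    }
    sub Count { my ($self) = @_; return scalar @{ $self->{items} } }
    sub Item  { my ($self, $index) = @_; return $self->{items}[ $index - 1 ] }
}

my $words = MockWords->new('This ', 'is ', 'a ', 'Perl ', 'example');

# A plain "for" over the collection object runs once, with $w bound to
# the collection itself rather than to a word.
my $passes = 0;
for my $w ($words) { $passes++ }

# Walking the collection by index reaches each word in turn.
my @bolded;
for my $i (1 .. $words->Count) {
    my $word = $words->Item($i);
    next unless $word->{Text} =~ /Perl/;
    $word->{Font}{Bold} = 1;
    push @bolded, $word->{Text};
}

print "passes over bare collection: $passes\n";
print "bolded: @bolded\n";
```

Against a real document, the same index walk over $paragraph->{Range}->Words should apply, as should an enumerator such as Win32::OLE::Enum; which reads better is a matter of taste.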

Re: How do I apply formatting to a particular word in a docx file using Win32::Ole in Perl?
by sflitman (Hermit) on Sep 23, 2010 at 05:06 UTC
    I think you shouldn't call TypeText, since that is what adds the second copy of each paragraph. The Selection is probably the paragraph at that point, which is why the whole line gets bold. Set the selection using $Word->Selection->MoveRight (check the API; I believe you can select a word at a time with this method, at least when called from Visual Basic).

    HTH,
    SSF

Re: How do I apply formatting to a particular word in a docx file using Win32::Ole in Perl?
by NetWallah (Canon) on Sep 25, 2010 at 16:57 UTC
    The solution you have posted is somewhat clunky - in that it does not read like "normal" perl.

    Here is a tested, more perlish alternative:

    use strict;
    use warnings;
    use Win32::OLE qw(in); # "in" provides the enumeration mechanism
    use Win32::OLE::Const 'Microsoft Word';

    my $file = 'C:\Users\Netwallah\My Documents\test.docx';

    my $WordApp = Win32::OLE->new('Word.Application', 'Quit');
    #$WordApp->{'Visible'} = 0;

    my $doc = $WordApp->Documents->Open($file)
        or die("Unable to open document '$file':", Win32::OLE->LastError());

    for my $para (in $doc->Paragraphs()){
        for my $CurrentWord (in $para->{Range}->Words()){
            next unless $CurrentWord->Text() =~/perl/i;
            $CurrentWord->Font->{Bold} = 1;
        }
    }

    $doc->Save();
    $WordApp->ActiveDocument->Close ;
    $WordApp->Quit;


      Hi, I am facing the same problem. Is there any extra Win32::OLE line we need to add for a docx file? Perl is installed on WinXP, but the MS Office plug-in for docx is not there. Is that a problem? Thanks, sudeep
        Which "same" problem are you facing ? Please state your problem.

        In order to use this (Win32::OLE based) mechanism to manage MS office documents, you will need to install the appropriate software - in this case - MS Word, on the machine where you run the code that calls Win32::OLE. If you install Word 2003, you need to install additional MS plug-ins so it can manage "docx" files.

                    "XML is like violence: if it doesn't solve your problem, use more."
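
One way to check the prerequisite described above is to try starting Word and report what happens. This is a minimal sketch, not a tested diagnostic: word_status is a hypothetical helper name, and Win32::OLE is loaded at run time so the script also runs (and simply reports the module as missing) on a machine without it.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a short status string instead of dying.
sub word_status {
    # Win32::OLE exists only on Windows, so load it at run time.
    return 'Win32::OLE not available'
        unless eval { require Win32::OLE; 1 };

    # 'Quit' asks Win32::OLE to call Word's Quit method when $word is destroyed.
    my $word = Win32::OLE->new('Word.Application', 'Quit')
        or return 'Word could not be started: ' . Win32::OLE->LastError();

    return 'Word version ' . $word->{Version} . ' is installed';
}

print word_status(), "\n";
```

Word 2003 reports itself as version 11.0 and Word 2007 as 12.0, so the version string gives a hint as to whether the docx add-on mentioned above is needed.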
