
What I am trying to do is verify that certain headings exist, and are in the correct order, within this Word doc. This is just the first step in the program. Next I will need to copy certain sections from another Word document into this one, under some of the headings. I was trying to avoid a text conversion because the finished Word document needs to retain all its headings and styles. Is there any way in Perl to search a Word doc and return the line number, within the Word doc, of the text I was looking for? Any help would be greatly appreciated.

Re^3: OLE Find with Paragraphs()
by tachyon (Chancellor) on Jun 14, 2004 at 23:29 UTC

    Word docs don't really have line numbers in the same way a text file does. Here is a search-and-replace sub that may point you in the right direction. Another option is RTF format: it is easy to munge with Perl and REs, and it retains most general formatting. YMMV.

    # $DOC_DIR, get_tempfile(), read_file() and create_file() are defined
    # elsewhere; $wdReplaceAll holds Word's wdReplaceAll constant (2).
    sub word_find_and_replace {
        my ( $word, $rel_file_path, $tokens_ref ) = @_;
        # first make a temporary file to do the search and replace on
        my ( $fh, $temp_name ) = get_tempfile( "$DOC_DIR/system", 'doc' );
        close $fh;
        my $content_ref = read_file( "$DOC_DIR/$rel_file_path" );
        create_file( "$DOC_DIR/system/$temp_name", $content_ref, 'overwrite ok' );
        $word->{visible} = 0;
        my $doc = $word->{Documents}->Open("$DOC_DIR/system/$temp_name");
        my $search_obj = $doc->Content->Find;
        my $replace_obj = $search_obj->Replacement;
        for my $token ( keys %$tokens_ref ) {
            my $find = '<?' . $token . '?>';
            my $replace = $tokens_ref->{$token};
            # now i know this looks weird but M$ word (at least 2000) wants \r
            # as the para marker not \r\n or even \n if you send \n you get little
            # binary squares..... oh well that's M$ for you.
            $replace =~ s/\r\n|\n/\r/g;   # this makes it work properly. GOK
            $search_obj->{Text} = $find;
            $replace_obj->{Text} = $replace;
            $search_obj->Execute({Replace => $wdReplaceAll});
        }
        $doc->Save;
        $doc->Close;
        # now get the data out of the modified temp file
        $content_ref = read_file( "$DOC_DIR/system/$temp_name" );
        # remove our unwanted temp files and objects
        unlink "$DOC_DIR/system/$temp_name";
        undef $search_obj;
        undef $replace_obj;
        undef $doc;
        return $content_ref;
    }
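For the heading check in the original question, a paragraph index is probably the closest thing Word offers to a line number. The following is only a rough sketch, not tested against any particular Word version: `collect_headings` and `headings_in_order` are made-up names, `$doc` is assumed to be a document object opened through Win32::OLE as in the sub above, and the style test assumes the built-in English style names ("Heading 1", "Heading 2", ...), which differ in localized installs.

```perl
use strict;
use warnings;

# Collect [ paragraph index, heading text ] pairs from an open Word document.
# Assumption: headings use the built-in English "Heading N" styles.
sub collect_headings {
    my ($doc) = @_;
    my $paras = $doc->Paragraphs;
    my @found;
    for my $n ( 1 .. $paras->Count ) {
        my $para      = $paras->Item($n);
        my $style_obj = $para->Style or next;
        my $style     = $style_obj->NameLocal;
        next unless defined $style && $style =~ /^Heading \d/;
        ( my $text = $para->Range->Text ) =~ s/\r\z//;    # strip the para mark
        push @found, [ $n, $text ];
    }
    return \@found;
}

# Pure Perl: check that every wanted heading appears, in the given order.
# Returns the paragraph index of each wanted heading, or an empty list if
# one is missing or turns up before an earlier wanted heading.
sub headings_in_order {
    my ( $found, $wanted ) = @_;
    my @positions;
    my $i = 0;
    for my $want (@$wanted) {
        $i++ while $i < @$found && $found->[$i][1] ne $want;
        return if $i >= @$found;
        push @positions, $found->[$i][0];
        $i++;
    }
    return @positions;
}

# Demonstration with hand-made data standing in for collect_headings($doc)
my @found = ( [ 1, 'Introduction' ], [ 4, 'Scope' ], [ 9, 'Summary' ] );
my @pos = headings_in_order( \@found, [ 'Introduction', 'Summary' ] );
print "@pos\n";    # prints "1 9"
```

The returned paragraph indexes could then be handed back to `$doc->Paragraphs->Item($n)` to locate the spot for pasting in the sections from the other document, which keeps everything inside Word and so preserves the styles.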

    cheers

    tachyon