in reply to Win32 Word behaving strangely

The reason for the slowness is your seemingly innocent Item($i) call.

In MS Word this construct is not an array-like access; every call walks the collection from element 1 up to $i to reach the $i-th element.

That makes each access O(N) instead of O(1), so your loop as a whole is O(N^2), which is bad.

Plus you call it several times per loop iteration.
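To put rough numbers on that, here is a minimal pure-Perl sketch. It does not touch Word at all: SlowCollection is a hypothetical stand-in whose Item($i) is simply assumed to cost $i steps, as described above, and whose All is assumed to cost one pass. It only counts simulated steps, so treat it as an illustration of the arithmetic rather than a measurement of Word itself.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a collection whose Item($i) walks from the start.
# This is NOT the Word object model, only a step counter for illustration.
package SlowCollection;

sub new {
    my ($class, @elems) = @_;
    return bless { elems => [@elems], steps => 0 }, $class;
}

# Simulated Item: charged $i steps for reaching the $i-th element (1-based).
sub Item {
    my ($self, $i) = @_;
    $self->{steps} += $i;
    return $self->{elems}[$i - 1];
}

# Simulated enumeration: a single pass over all elements.
sub All {
    my ($self) = @_;
    $self->{steps} += @{ $self->{elems} };
    return @{ $self->{elems} };
}

sub steps { $_[0]{steps} }

package main;

my $n = 1000;

# Access every element by index, the way the slow loop does.
my $by_index = SlowCollection->new(1 .. $n);
my @a;
push @a, $by_index->Item($_) for 1 .. $n;

# Fetch everything once, then work on a plain Perl array.
my $by_enum = SlowCollection->new(1 .. $n);
my @b = $by_enum->All;

printf "indexed loop: %d steps\n", $by_index->steps;   # 500500
printf "single pass:  %d steps\n", $by_enum->steps;    # 1000
```

With 1000 elements the indexed loop is charged 1 + 2 + ... + 1000 = 500500 steps against 1000 for the single pass, and the gap keeps growing with the size of the document.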

A typical workaround is something like this:

use Win32::OLE qw(in);
@items = in $doc->Words->Items;

Been there, seen that :)

Re^2: Win32 Word behaving strangely
by cormanaz (Deacon) on Jul 28, 2006 at 20:48 UTC
    I knew it had to be something like that. But I tried your fix, a la
    use strict;
    use Win32::OLE qw( in );
    use Win32::OLE::Const 'Microsoft Word';

    my $filename = 'E:\test.doc';
    my $word = Win32::OLE->new('Word.Application', 'Quit');
    my $doc = $word->Documents->Open($filename)
        || die("Unable to open document ", Win32::OLE->LastError());
    my $nwords = $doc->Words->Count;
    my @wordtext;
    my @wordcolor;
    my $starttime = time;
    my @items = in $doc->Words->Items;
    for (my $i = 1; $i <= $nwords; $i++) {
        $wordcolor[$i] = $items[$i]->HighlightColorIndex;
        $wordtext[$i] = $items[$i]->Text;
    }

    and @items comes back null.

      Try 'in $doc->Words;' instead of 'in $doc->Words->Items;'.
      'in' does not always work well...
      But you can try working at the paragraph level, e.g.
      my @paras = in $doc->Paragraphs; # or so

      BTW, even $doc->Words->Count will traverse the entire document content... so be careful!

      I would also suggest slurping the entire document content... but then it will be difficult to get at a word in the middle... so this may not be good advice...