Zaphod137 has asked for the wisdom of the Perl Monks concerning the following question:

I'm in need of some Word OLE mojo.
What I'm trying to do: Search through a large Word document, find a specific table(s), copy that table into an Excel file and name the Excel file appropriately.
What I can do so far: Find the appropriate table, copy, paste it into Excel.
What I need to know how to do: The name I'll assign to the Excel file depends on the line preceding the table I copied in Word, e.g.:

Table Name1 blah blah blah
----------------------------
| Cell11 | Cell12 | Cell13 |
| Cell21 | Cell22 | Cell23 |
| Cell31 | Cell32 | Cell33 |
----------------------------

I copy the full table into Excel, and then want to extract the string "Name1" from the line just before it so I can name my new Excel file "Name1.xls".

How do I select the line just before the table I'm focused on with Perl?

Hopefully this makes sense. Thanks!

Replies are listed 'Best First'.
Re: Word tables, OLE, perl, and me.
by CountZero (Bishop) on Feb 02, 2009 at 22:15 UTC
    If I remember well (from my Visual Basic (for Applications) days, long long ago), there were some functions in the Word API to move a cursor on a character, word, line, paragraph, ... basis. Of course I cannot remember which function it was (the memory is the first thing to go :-( ), but I think you have to look in that direction. Probably something like "Move" or "MoveUp" with a smattering of parameters and constants to indicate how far you want to move. Try the help function in the Word VB editor.

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

Re: Word tables, OLE, perl, and me.
by Zaphod137 (Novice) on Feb 02, 2009 at 22:51 UTC
    I have come across "MoveUp" and attempted to use it. This seems promising, but I think I'm not understanding the syntax or context I need to use it. My example:
    my $word = CreateObject Win32::OLE 'Word.Application' or die $!;
    $word->{Visible} = 1;
    my $document = $word->Documents->Open('C:/temp/doctest.doc');
    my $table = $word->ActiveDocument->Tables(2);
    my $selection = $table->Select;
    $selection->MoveUp(wdLine,1);
    my $str = $selection->Range(0,10)->{Text};
    $table->Range->Copy;
    print qq{$str\n};
    This gives me an error on the MoveUp call: "Can't call method "MoveUp" on an undefined value at ..."
    wdLine is supposed to be a known Unit as far as I know...
      It looks as if $selection did not get set. Are you sure $selection = $table->Select returns a "selection" object on which you can call the MoveUp(wdLine,1) method?

      CountZero


        To get the selection you do the following:
        $table->Select;
        my $selection = $word->ActiveWindow->Selection;

        - John
Re: Word tables, OLE, perl, and me.
by binf-jw (Monk) on Feb 03, 2009 at 09:42 UTC
    Update: Just read the whole post, and I had this problem as well (see bottom).

    TABLE TO EXCEL: I had this problem a few months ago and the following worked fine for me. The only real difference from the previous suggestions is that I used the 'Range' property, not the 'Range' method.
    # This was inside a wrapper, hence the Excel object was stored as below
    my $book = $self->{Excel}->Workbooks->Add;
    # I also had previously stored all the tables in the object
    my $table = $self->{Tables}->[$i];
    # Copy the table range
    $table->{Range}->Copy();
    # Add a new worksheet (note: $book, not $Book)
    my $sheet = $book->Worksheets->Add;
    $sheet->Paste;
    # 0 rather than the bareword False, which is a true string in Perl
    $self->{Excel}->{DisplayAlerts} = 0;
    $sheet->Delete();
    # Close the Excel book
    $book->Close( { SaveChanges => 0 } );
    # Empty clipboard
    Win32::Clipboard->Empty();

    Without meaning to sound like a pessimist, this will probably be the first of many problems. I've been working on this for the last few months, and other issues you'll probably face will be:
    - Merged cells: only an issue if you actually want to extract the data when saved as txt tables (which I was).
    - Wrapped cells in Word: a wrapped cell is copied to Excel as one row per line of text it covers.
    E.g.:
    |-----------------|
    | A Heading       |
    | that is wrapped |
    |-----------------|
    becomes
    |-----------------|
    | A Heading       |
    |-----------------|
    | that is wrapped |
    |-----------------|
    in Excel.

    I've solved both of the above and a few others, but the solutions aren't really worth going into here.
    - John

    Update: I think you can actually use both {Range} and Range to represent the property, but the code I listed definitely works for me. (Sorry it isn't self-contained, but it's quite hard to put a Word table object in __DATA__ ;) (I'll probably be proved wrong though).)



    TABLE TITLE: The way I did this was to have a regular expression matching the title paragraph, store the start location of each title and each table, and then merge the two arrays based on location (my titles weren't always neatly above the table, and were occasionally missing).
    For table start: $table->{Range}{Start}
    For paragraph start: $paragraph->{Range}{Start}
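
A minimal sketch of that merge step, in plain Perl so it runs without Word. It assumes you have already collected two ascending lists of character offsets (the `{Range}{Start}` values mentioned above) and pairs each table with the nearest title that starts before it. The function name `pair_titles_with_tables`, the sample offsets, and the rule that a title is used by at most one table are assumptions made for illustration, not something taken from the original wrapper.

```perl
use strict;
use warnings;

# Pair each table with the closest title that starts before it.
# Both lists hold character offsets, sorted ascending.
# A table with no title of its own gets undef as its title offset.
sub pair_titles_with_tables {
    my ($title_starts, $table_starts) = @_;
    my @pairs;
    my $t = 0;          # index of the next title not yet passed
    my $last_title;     # most recent title seen before the current table
    for my $table_start (@$table_starts) {
        while ( $t < @$title_starts && $title_starts->[$t] < $table_start ) {
            $last_title = $title_starts->[ $t++ ];
        }
        push @pairs, [ $last_title, $table_start ];
        $last_title = undef;    # assumption: one title serves one table
    }
    return \@pairs;
}

# Made-up offsets: the table at 900 has no title above it.
my $pairs = pair_titles_with_tables( [ 10, 400, 1500 ], [ 50, 450, 900, 1600 ] );
for my $p (@$pairs) {
    printf "table at %d: title at %s\n",
        $p->[1], defined $p->[0] ? $p->[0] : 'none';
}
```

If one heading can legitimately cover several tables, drop the line that resets `$last_title` so the title carries forward.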
Re: Word tables, OLE, perl, and me.
by Zaphod137 (Novice) on Feb 02, 2009 at 21:16 UTC
    Currently I'm using a simple document for testing, so my code is minimal. I know the table I'm looking for is the 2nd table; in the real version, I'll just have to check the first cell to make sure it has the correct word.
    my $word = CreateObject Win32::OLE 'Word.Application' or die $!;
    $word->{Visible} = 1;
    my $document = $word->Documents->Open('C:/temp/doctest.doc');
    my $table = $word->ActiveDocument->Tables(2);
    $table->Range->Copy;
    # ... pasted into Excel ...
    Please assume a high degree of Perl and Win32 OLE noobness :)
Re: Word tables, OLE, perl, and me.
by Zaphod137 (Novice) on Feb 02, 2009 at 23:49 UTC
    Heh, no I'm not sure. Unfortunately, shaking my fist at it doesn't seem to fix it.

    So, I am now able to move the cursor to the position I want with this:
    my $word = CreateObject Win32::OLE 'Word.Application' or die $!;
    $word->{Visible} = 1;
    my $document = $word->Documents->Open('C:/temp/doctest.doc');
    my $table = $word->ActiveDocument->Tables(2);
    $table->Range->Copy;
    my $selection = $table->Select;
    $word->Selection->MoveUp(wdLine,1);
    That gets the cursor to the start of the line I want! Now I'm just not sure how to pull out the full line so I can run a regular expression on it.
      Just a guess, as I don't have time to test this yet, but you should probably be using 'wdParagraph' to move up. I think a 'line' is not what you think it is: a paragraph is terminated by a newline, whereas a line is just terminated by a wrap.
      For example, in a multiline title:
      Table 1: blah blah blah blah blah blah b
      lah blah blah.
      Moving up 1 line would only get to "lah blah blah.", whereas moving up 1 paragraph would get to "Table 1: blah blah blah blah blah blah blah blah blah." You could then surely just select it as follows (a Range exposes a Paragraphs collection, so take its first item):
      my $text = $selection->{Range}->Paragraphs(1)->{Range}{Text};
      - John
Re: Word tables, OLE, perl, and me.
by binf-jw (Monk) on Feb 03, 2009 at 15:38 UTC
    In the spirit of Tim Toady I found another way to solve the problem:
    # Load constants
    my %CONSTANT = (
        %{ Win32::OLE::Const->Load( "Microsoft Excel 11.0 Object Library" ) },
        %{ Win32::OLE::Const->Load( "Microsoft Office 11.0 Object Library" ) },
        %{ Win32::OLE::Const->Load( "Microsoft Word 11.0 Object Library" ) },
    );
    # Get the table selection
    $table->{Range}->Select();
    my $selection = $word->ActiveWindow->{Selection};
    # Get the text from the previous paragraph
    my $prev_text = $selection->Previous( $CONSTANT{wdParagraph} )->{Text};

    Personally I would (and do) put it in a loop that tracks backwards looking for the title. Perhaps there are a few blank lines between the title and the associated table?
    # Example title regex
    my $title_regex = qr/ Table \s ( \d+ ) : /xms;
    # Check only the five previous paragraphs (could also keep checking
    # until you find a match, with a while loop)
    for ( 1 .. 5 ) {
        # Get previous paragraph
        # Get Text
        # Exit or do something else if the title is found.
        last if $text =~ $title_regex;
    }

    I'll let you fill in the blanks...
    - John
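
One way the blanks might be filled in, sketched in plain Perl so it can be run without Word. It assumes the paragraph texts have already been pulled out of the document into an array (for example by repeated `Previous` calls as above) and that `$table_index` marks where the table sits; the function name `find_title_id`, its arguments, and the sample data are hypothetical.

```perl
use strict;
use warnings;

my $title_regex = qr/ Table \s ( \d+ ) : /xms;

# Walk backwards from the paragraph just before the table, looking at
# no more than $max_back paragraphs. Returns the captured table number,
# or undef if no title is found inside that window.
sub find_title_id {
    my ( $paragraphs, $table_index, $max_back ) = @_;
    for my $step ( 1 .. $max_back ) {
        my $i = $table_index - $step;
        last if $i < 0;                 # ran off the top of the document
        my $text = $paragraphs->[$i];
        next if $text !~ /\S/;          # skip blank paragraphs
        return $1 if $text =~ $title_regex;
    }
    return;
}

# Hypothetical paragraph texts; index 4 is where the table starts.
my @paragraphs = ( "Intro text\r", "Table 7: Widget sizes\r", "\r", "\r", "<table>" );
my $id = find_title_id( \@paragraphs, 4, 5 );
print defined $id ? "Found table $id\n" : "No title found\n";
```

Capping the search with `$max_back` keeps a table with a missing title from silently picking up the heading of an earlier table far up the document.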
Re: Word tables, OLE, perl, and me.
by jdporter (Paladin) on Feb 02, 2009 at 21:06 UTC

    Could you please show the code you use to find the table?

      Hi,
      Can you please tell me how I can write a table (rows and columns, as in Excel) into a Word doc? I have all the values with me. The table will be like below:
      ------------
      Param|value
      ------------
      ID   |abc
      name |xyz
      Job  |lmn
      ------------
      Thanks
      Sanjoy
Re: Word tables, OLE, perl, and me.
by Zaphod137 (Novice) on Feb 09, 2009 at 19:05 UTC
    Thanks for all the help folks. Sorry for no reply for a while, I had jury duty on a criminal case basically all last week....

    I did get it sorted out, using a combination of the suggestions. I'm using HomeKey and EndKey for the selection, and am iteratively moving backwards line by line to find the word I'm looking for. It seems to work out well. It also became useful when another document had a slightly different format with multiple tables for a single Table1 heading. Of course, I'm going to have to do this for multiple documents now with differing formats... yay. Thanks for all the help! Example code for those interested:
    while ($tablefound == 0){
        $word->Selection->MoveUp(wdLine,1);
        $word->Selection->HomeKey(wdLine);
        $word->Selection->EndKey(wdLine, wdExtend);
        my $str = $word->Selection->{Text};
        if ($str =~ m/Table (\d+)/){
            $tableID = $1;
            $tablefound = 1;
        }
    }
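
To close the loop on the original question (naming the file "Name1.xls"), here is a small sketch of turning the selected heading text into a file name. The helper `excel_name_from_heading` is hypothetical, and it assumes the identifier is the first run of word characters after "Table", which covers both the "Name1" example in the question and the numeric IDs matched in the loop above.

```perl
use strict;
use warnings;

# Build an Excel file name from the heading line found above a table.
# Returns undef when the line does not look like a table heading.
sub excel_name_from_heading {
    my ($line) = @_;
    return unless defined $line && $line =~ m/Table\s+(\w+)/;
    return "$1.xls";    # \w+ keeps the name free of path separators
}

# Word selections usually end in a paragraph mark (\r); it is ignored here.
print excel_name_from_heading("Table Name1 blah blah blah\r"), "\n";
```

Because the capture is limited to `\w+`, trailing punctuation such as the colon in "Table 1:" never ends up in the file name.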
      I am using your method to get the data from an MS Word table, but I am facing the same problem: Excel cells are merged for wrapped text when I export the MS Word table into an Excel file. I think you have done something to get them unmerged. Could you please post that part of the code? If possible, can you give your mail ID also? Mine is Kowshik.S@KPITCummins.com