Hi Monks,

I'm an absolute newbie to Perl and I'm trying to print the contents of a table (in a docx) into a tab-delimited or xlsx file, either is fine.

There are over 100 docx's that have tables in them with identical delimiters ("Version History", "Table of Contents").

I can print the entire contents of the file to a .txt, but I can't yet get the data between the delimiters.

Any suggestions?

Thank you!



An update with all of the code I'm currently using:
use strict; use warnings; use Win32::OLE qw(in); use Win32::OLE::Const 'Microsoft Word'; use Win32::OLE::Variant; $|=1; sub Parse{ my $document_name = 'C:\TestPolicy.rtf'; my $word = Win32::OLE->GetActiveObject('Word.Application') || Win32::OLE->new('Word.Application','Quit') or die Win32::OLE->LastError(); my $document = $word->Documents->Open($document_name) or die Win32::OLE->LastError(); my $paragraphs = $document->Paragraphs (); my $n_paragraphs = $paragraphs->Count (); my $outputfile = 'C:\testfile.txt'; open(INPUT, $document_name) or die "Failed to open $document_name\n"; while (<INPUT>){ if ($_ =~ /HISTORY/../TABLE/){ open(OUTPUT, '>'.$outputfile) or die "Can't create $output +file.\n"; print OUTPUT "$_\n"; close OUTPUT; } } close INPUT; } Parse()

In reply to Getting lines in a file between two patterns by Daikini

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.