I am also in the process of generating Word documents using Perl. I didn't think about Office HTML, but about 5 years ago I made some RTF documents from a shell script, so I started from scratch and tried that again from Perl, which has turned out to be very easy...

I reverse engineered an RTF document by saving a basic one from Word, then stripping out tags in a text editor until I had the bare bones of what I needed and nothing else (and it would still open without crashing Word!). This is what I found works as a basis:

http://ref.a32.net/technical/file_format/rtf/basic_table_rtf_source.txt

That said, I expect it's more flexible and robust to use horrible Office HTML, as people have pointed out, if you can handle that.
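
For what it's worth, here is a minimal sketch of the bare-bones RTF idea in Perl. It is not the contents of the file linked above; it just uses the standard RTF table control words (\trowd, \cellx, \intbl, \cell, \row). The sub names, the sample data and the 3000-twip column width are all made up for illustration, so treat it as a starting point and check the result in your own version of Word.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Escape the three characters that are special in RTF text: \ { }
sub rtf_escape {
    my ($text) = @_;
    $text =~ s/([\\{}])/\\$1/g;
    return $text;
}

# Build one table row. Each \cellx gives the right edge of a cell in
# twips (1440 twips = 1 inch); 3000 per column is an arbitrary choice.
sub rtf_row {
    my @cells = @_;
    my $width = 3000;
    my $row   = '\trowd';
    $row .= '\cellx' . ( $width * $_ ) for 1 .. scalar @cells;
    $row .= "\n";
    $row .= '\pard\intbl ' . rtf_escape($_) . '\cell' . "\n" for @cells;
    return $row . '\row' . "\n";
}

# Wrap the rows in a minimal RTF header and closing brace.
# Takes a list of array refs, one per table row.
sub rtf_document {
    my @rows = @_;
    my $rtf  = '{\rtf1\ansi\deff0{\fonttbl{\f0 Arial;}}' . "\n";
    $rtf .= rtf_row(@$_) for @rows;
    $rtf .= '\pard\par' . "\n}\n";    # plain paragraph after the table
    return $rtf;
}

# Hypothetical sample data
my $doc = rtf_document(
    [ 'Name',         'Amount' ],
    [ 'Smith {test}', '100' ],
);
print $doc;
```

Run it as something like `perl make_rtf.pl > table.rtf` (the script name is just an example) and open the result in Word.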

You say you have "Word Template fields". From my memories of my days of working with that stuff, you use those for doing a mail merge with a master (now called "main"?) document, right?

If that's the case, then all you should have to do is export your data to some database-like format (possibly Excel-compatible HTML, or even just CSV - TOO easy!) and then do a mail merge, pulling the data from that file into Word.
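
A rough sketch of the CSV side of that, using only core Perl. The field names and records are invented for the example; the assumption is that the first line holds the field names and that they match the merge field names in your main document. For anything beyond simple data, a CPAN module such as Text::CSV would probably be the safer choice.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Quote one field the usual CSV way: wrap it in double quotes and
# double any embedded double quotes.
sub csv_field {
    my ($value) = @_;
    $value = '' unless defined $value;
    $value =~ s/"/""/g;
    return qq{"$value"};
}

# Join a list of values into one CSV line.
sub csv_line {
    return join( ',', map { csv_field($_) } @_ ) . "\n";
}

# Hypothetical merge field names and records
my @fields  = qw(ClientName CaseNumber CourtDate);
my @records = (
    { ClientName => 'Smith, John',   CaseNumber => 'A-1001', CourtDate => '1 Nov' },
    { ClientName => 'Jane "JD" Doe', CaseNumber => 'B-2002', CourtDate => '3 Dec' },
);

# Header line first, then one line per record in the same field order
my $csv = csv_line(@fields);
$csv .= csv_line( @{$_}{@fields} ) for @records;
print $csv;
```

Redirect the output to a file (say `merge_data.csv`, again just an example name) and pick that file as the data source when you set up the mail merge.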

I've just looked in the help in Word where it says:

"What types of data sources can I use? You can use just about any type of data source that you want, including a Word table, Microsoft Outlook contact list, Excel worksheet, Microsoft Access database, or ASCII text file. If you haven't already stored information in a data source, Word guides you step by step through setting up a Word table that contains your names, addresses, and other data."

Try this Office help link in I.E. if you're using Word 2000:
mk:@MSITStore:C:\Program%20Files\Microsoft%20Office\Office\1033\wdmain9.chm::/html/wdconOverviewOfMailMerge.htm

Or just crank up Macro$haft Wurd Help and type in "mail merge" :o)

IHTH...


In reply to Re: parse MS Word Template fields for legal documents by serf
in thread parse MS Word Template fields for legal documents by Anonymous Monk
