in reply to MS Word conversion

Converting from M$ Word format to plain text loses all embedded formatting information (only a conversion to RTF, Rich Text Format, keeps it), so the answer to your 1st question, as others have pointed out, is no.

As for the 2nd question, I've always used Win32::OLE to automate tasks on M$ documents in their native format - assuming the M$ application supports OLE (some don't).

A user level that continues to overstate my experience :-))

Re^2: MS Word conversion
by gibsonca (Beadle) on Oct 08, 2009 at 17:37 UTC

    The examples on the web for modifying an existing MS Word document (Word 2003 in my case) either do very little or error out. May I have an example that searches the document for some text and then adds a sub-paragraph or section at that point? Thanks for your time!

      Word 2007 introduced a "new" default file format, typically indicated by a filename extension of .docx instead of .doc. Word 2003 still defaults to the binary .doc format, but it can open and save .docx files once Microsoft's Office Compatibility Pack is installed. The docx format is a ZIP file containing ugly, generated XML and some other stuff. So, in theory, you just unzip the docx file, modify the XML, and zip it back to a new docx file. Microsoft publishes a description of the XML format (Office Open XML, standardized as ECMA-376), buried somewhere on their website.
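
      A rough sketch of the unzip / modify / re-zip idea, written in Python with only the standard library as a language-neutral illustration (in Perl, a module such as Archive::Zip could play the same role). It rests on several assumptions that are not from this thread: the file really is a .docx package and not a binary .doc, the body text lives in word/document.xml encoded as UTF-8, the XML uses the conventional w: prefix, and the search text sits inside a single run (Word often splits visible text across several runs, in which case this naive search will miss it). The function name insert_paragraph_after is made up for this example.

```python
import zipfile
from xml.sax.saxutils import escape

# Assumed location of the main document body inside the .docx package.
DOC_PART = "word/document.xml"


def insert_paragraph_after(src, dst, needle, new_text):
    """Copy the .docx at src to dst, adding a plain paragraph right after
    the first paragraph whose XML contains needle.

    src and dst may be file names or file-like objects.
    Returns True if needle was found, False otherwise.
    """
    found = False
    with zipfile.ZipFile(src) as zin, \
            zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
        for item in zin.infolist():
            data = zin.read(item.filename)
            if item.filename == DOC_PART:
                xml = data.decode("utf-8")
                # Text is XML-escaped inside the file, so escape the needle too.
                hit = xml.find(escape(needle))
                # Naive: the next closing tag ends the paragraph holding the hit.
                end = xml.find("</w:p>", hit) if hit != -1 else -1
                if end != -1:
                    end += len("</w:p>")
                    para = "<w:p><w:r><w:t>%s</w:t></w:r></w:p>" % escape(new_text)
                    xml = xml[:end] + para + xml[end:]
                    found = True
                data = xml.encode("utf-8")
            # Every other part (styles, images, ...) is copied unchanged.
            zout.writestr(item, data)
    return found
```

      Called as insert_paragraph_after("in.docx", "out.docx", "some heading", "New text"), it writes a new file and leaves the original alone. Plain string splicing is used instead of an XML parser on purpose: re-serializing through a generic parser can rename or drop namespace declarations, which Word may then reject. The price is that the search can also match text inside tags or attributes, so treat this as a starting point only.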

      Alexander

      --
      Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)