in reply to How to generate HTML from Word document ?

How complex is your Word document? Is it simple text with styling, or a complex page layout with tables and embedded graphics? Can the HTML output from Word be used for your purpose? Be warned that the HTML created by Word is a pig: very bloated.

Can you convert the Word doc into RTF and then parse that? That would give you simple styling cues that you can then convert into HTML. If the Word document is complex, or the HTML output from Word works for your final output, you may want to try Win32::OLE to remote-control Word and have it create the HTML for you.
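
To illustrate the RTF route, here is a toy sketch in core Perl. It is an assumption-laden example, not a real RTF parser: the helper name rtf_fragment_to_html is made up for this post, and it recognises only the \b, \b0, \i, \i0 and \par control words in a flat fragment. Real Word RTF also has nested groups, font and colour tables, and escaped characters, none of which are handled here, so treat it only as a picture of "styling cue in, HTML tag out".

```perl
use strict;
use warnings;

# Hypothetical mapping from a few RTF control words to HTML tags.
# Anything not listed here is silently dropped.
my %TAG_FOR = (
    'b'   => '<b>',
    'b0'  => '</b>',
    'i'   => '<i>',
    'i0'  => '</i>',
    'par' => "</p>\n<p>",
);

# Convert a flat RTF fragment (no header, no nested groups) to HTML.
sub rtf_fragment_to_html {
    my ($rtf) = @_;
    my $html = '';

    # Walk the string token by token: a control word (with optional
    # numeric parameter and one delimiting space), a run of plain text,
    # or a stray backslash/brace that this sketch ignores.
    while ($rtf =~ /\G(?:\\([a-z]+)(-?\d+)? ?|([^\\{}]+)|[\\{}])/gc) {
        my ($word, $param, $text) = ($1, $2, $3);
        if (defined $word) {
            my $key = $word . (defined $param ? $param : '');
            $html .= $TAG_FOR{$key} if exists $TAG_FOR{$key};
        }
        elsif (defined $text) {
            $text =~ tr/\r\n//d;      # line breaks in RTF source carry no meaning
            $text =~ s/&/&amp;/g;     # escape HTML metacharacters
            $text =~ s/</&lt;/g;
            $text =~ s/>/&gt;/g;
            $html .= $text;
        }
    }
    return "<p>$html</p>";
}

print rtf_fragment_to_html('Hello \b bold\b0  and \i italic\i0 \par Next'), "\n";
```

For anything beyond a trivial document you would want a proper RTF parsing module rather than a regex loop like this.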


Re^2: How to generate HTML from Word document ?
by marto (Cardinal) on May 30, 2013 at 09:34 UTC

    "...you may want to try Win32::OLE to remote control Word to create the HTML for you."

    This isn't an option, they're running Linux.

        Using Wine may work, but without a lot of investigation I couldn't recommend it. IMHO it muddies the waters, in that you may end up chasing problems with running MS Office, Perl and all the required modules under Wine. That said, it's a fair point to raise, but it would take a lot of time to investigate.