Do you have Word? If so, it may be possible to use Word's alleged ability to read HTML files to get what you want. I say "alleged" because Word will only read HTML files of a certain format. I don't speak HTML, so I haven't managed to get any code working to do this, but automating Word is not that difficult.

I opened a Word instance and saved a blank document as HTML. This generated most of the code below, which is nearly working, i.e. it doesn't work. The problem seems (remember, I don't speak HTML) to be that the temp file ends up with head and body tags from both the existing HTML document and the Word top and tail, so Word rejects it when it tries to open it. If anyone knows enough about HTML to get an HTML file into a form Word will accept, this might be a way forward for you - if you have Word!
Regards,
John
use strict;
use warnings;
use File::Spec;
use Win32::OLE;
use Win32::OLE::Const 'Microsoft Word';

# Top of a blank document saved from Word as HTML. The heredoc is
# single-quoted, so nothing inside it needs escaping.
my $htmltop = <<'END_HTMLTOP';
<html xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:w="urn:schemas-microsoft-com:office:word"
xmlns="http://www.w3.org/TR/REC-html40">

<head>
<meta http-equiv=Content-Type content="text/html; charset=windows-1252">
<meta name=ProgId content=Word.Document>
<meta name=Generator content="Microsoft Word 10">
<meta name=Originator content="Microsoft Word 10">
<link rel=File-List href="Blank_files/filelist.xml">
<!--[if gte mso 9]><xml>
 <o:DocumentProperties>
  <o:Author>Davies</o:Author>
  <o:LastAuthor>Davies</o:LastAuthor>
  <o:Revision>1</o:Revision>
  <o:TotalTime>1</o:TotalTime>
  <o:Created>2011-02-01T14:47:00Z</o:Created>
  <o:LastSaved>2011-02-01T14:48:00Z</o:LastSaved>
  <o:Pages>1</o:Pages>
  <o:Lines>1</o:Lines>
  <o:Paragraphs>1</o:Paragraphs>
  <o:Version>10.2625</o:Version>
 </o:DocumentProperties>
</xml><![endif]--><!--[if gte mso 9]><xml>
 <w:WordDocument>
  <w:Compatibility>
   <w:BreakWrappedTables/>
   <w:SnapToGridInCell/>
   <w:WrapTextWithPunct/>
   <w:UseAsianBreakRules/>
  </w:Compatibility>
  <w:BrowserLevel>MicrosoftInternetExplorer4</w:BrowserLevel>
 </w:WordDocument>
</xml><![endif]-->
<style>
<!--
 /* Style Definitions */
 p.MsoNormal, li.MsoNormal, div.MsoNormal
    {mso-style-parent:"";
    margin:0cm;
    margin-bottom:.0001pt;
    mso-pagination:widow-orphan;
    font-size:12.0pt;
    font-family:Arial;
    mso-fareast-font-family:"Times New Roman";
    mso-bidi-font-family:"Times New Roman";}
@page Section1
    {size:595.3pt 841.9pt;
    margin:72.0pt 90.0pt 72.0pt 90.0pt;
    mso-header-margin:35.4pt;
    mso-footer-margin:35.4pt;
    mso-paper-source:0;}
div.Section1
    {page:Section1;}
-->
</style>
<!--[if gte mso 10]>
<style>
 /* Style Definitions */
 table.MsoNormalTable
    {mso-style-name:"Table Normal";
    mso-tstyle-rowband-size:0;
    mso-tstyle-colband-size:0;
    mso-style-noshow:yes;
    mso-style-parent:"";
    mso-padding-alt:0cm 5.4pt 0cm 5.4pt;
    mso-para-margin:0cm;
    mso-para-margin-bottom:.0001pt;
    mso-pagination:widow-orphan;
    font-size:10.0pt;
    font-family:"Times New Roman";}
</style>
<![endif]-->
</head>

<body lang=EN-GB style='tab-interval:36.0pt'>
END_HTMLTOP

my $htmltail = "</body>\n</html>\n";

my $infile = shift @ARGV;
die "Usage: $0 file.html\n" unless defined $infile;

# foo.html -> footmp.html for the wrapped copy, foo.doc for the result.
my $tempfile = $infile;
$tempfile =~ s/\./tmp\./;
my $outfile = $infile;
$outfile =~ s/\.html?$/.doc/i;    # escape the dot and anchor at the end

open(my $fhi, "<", $infile)   or die "Can't open input file $infile: $!";
open(my $fht, ">", $tempfile) or die "Can't open temp file $tempfile: $!";

# Copy the input through unchanged. Its own html, head and body tags
# end up nested inside the Word top and tail.
print {$fht} $htmltop;
while (my $line = <$fhi>) {
    print {$fht} $line;
}
print {$fht} $htmltail;
close $fhi;
close $fht;

# Word resolves relative paths against its own working directory,
# so hand it absolute paths.
$tempfile = File::Spec->rel2abs($tempfile);
$outfile  = File::Spec->rel2abs($outfile);

my $word = Win32::OLE->new('Word.Application')
    or die "Can't start Word: " . Win32::OLE->LastError();
my $doc = $word->Documents->Open($tempfile)
    or die "Word can't open $tempfile: " . Win32::OLE->LastError();
$doc->SaveAs({FileName => $outfile, FileFormat => wdFormatDocument});
$doc->Close();
$word->Quit();
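One possible way round the doubled head and body tags described above would be to strip the input's own wrapper before adding the Word top and tail. What follows is only a minimal sketch and has not been tried against Word: strip_html_wrapper is a hypothetical helper name, and it assumes the input is simple enough that regexes can find the doctype, html, head and body tags. A proper HTML parser would be more robust.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): reduce a complete
# HTML document to the markup found between its body tags.
sub strip_html_wrapper {
    my ($html) = @_;

    $html =~ s/<!DOCTYPE[^>]*>//i;              # drop any doctype declaration
    $html =~ s/<head\b.*?<\/head\s*>//is;        # drop the head and everything in it
    $html =~ s/<\/?(?:html|body)\b[^>]*>//gi;    # drop the html and body tags themselves

    return $html;
}

# Small demonstration on an in-memory document.
my $sample = "<html><head><title>T</title></head>\n"
           . "<body bgcolor=\"white\"><p>Hello</p></body></html>";
print strip_html_wrapper($sample), "\n";
```

In the script above, this would mean reading the whole input file into one string and printing strip_html_wrapper of that string between $htmltop and $htmltail, instead of copying the file line by line. Whether Word then accepts the temp file is an assumption still to be checked.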
In reply to Re: Approaches to produce word docs by davies, in thread Approaches to produce word docs by LanX