dirtdog has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks

I'm creating an XLS spreadsheet on a Linux server using Spreadsheet::WriteExcel. It has 12 worksheets and each sheet has different color combinations and URL hyperlinks.

Now I need to convert it to HTML to display it on a webpage. I can do this manually without issue, but when I tried to use Spreadsheet::ParseExcel to automate the process (borrowed from an old monk post) I ended up with an extremely basic display

my $excel_obj = Spreadsheet::ParseExcel->new(); my $workbook = $excel_obj->Parse($outfile); die "Workbook did not return worksheets!\n" unless ref $workbook->{Worksheet} eq 'ARRAY'; for my $worksheet ( @{$workbook->{Worksheet}} ) { print OUT_HTML_FILE "<table>\n"; for my $row ( 0 .. $worksheet->{MaxRow} ) { print OUT_HTML_FILE "<tr>\n"; for my $col ( 0 .. $worksheet->{MaxCol} ) { my $cell = $worksheet->{Cells}[$row][$col]; print OUT_HTML_FILE "<td>"; print OUT_HTML_FILE ref $cell ? $cell->Value : ''; print OUT_HTML_FILE "</td>\n"; } print OUT_HTML_FILE "</tr>\n"; # record ends } print OUT_HTML_FILE "</table>\n"; # worksheet ends }

When I look at the files that Excel produces when I use the "Save As" to an HTML file. It produces a main .htm file plus a link to 12 separate htm files for each worksheet. Within each sheet the HTML seems to be somewhat complex. Can you please let me know if Spreadsheet::ParseExcel would be a good option to try and produce the same .htm files that are produced from the "File Save As" option within Excel?

Thank you very much for your time

Replies are listed 'Best First'.
Re: Converting XLS to HTML
by misc (Friar) on Jul 25, 2017 at 08:20 UTC

    Is there a reason, that you are creating a excel file first?

    Most probably it would be better to write the html directly, using e.g. HTML::table, or Template

    This way you have also much more control about how the result will look like

Re: Converting XLS to HTML
by ww (Archbishop) on Jul 24, 2017 at 20:39 UTC
    "Within each sheet the HTML seems to be somewhat complex."

    Are you talking about the verbose, nonsensical, proprietary output that MS labels "HTML?"

    And what is the "extremely basic display" to which you refer? We'd likely be able to make somewhat more sensible answers to your final question (similarity between excel export and S:PE output) if we knew what you're getting now... a difficulty you could cure by adding a snippet of the output to your post (and, NJBTW, a snippet at least from a couple excel sheets).

    With luck, you'll get answers to your actual questions here, even without the information I'm asking you to provide... but best to remember: better questions get better answers.

    You'll probably find On asking for help and How do I post a question effectively? helpful.


    $anecdote ne $data

    Questions containing the words "doesn't work" (or their moral equivalent) will usually get a downvote from me unless accompanied by:
    1. code
    2. verbatim error and/or warning messages
    3. a coherent explanation of what "doesn't work actually means.

      Here would be an example of the main.htm code that was produced by MS Excel:

      <html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:x="urn:schemas-microsoft-com:office:excel" xmlns="http://www.w3.org/TR/REC-html40"> <head> <meta name="Excel Workbook Frameset"> <meta http-equiv=Content-Type content="text/html; charset=windows-1252 +"> <meta name=ProgId content=Excel.Sheet> <meta name=Generator content="Microsoft Excel 14"> <link rel=File-List href="env_files/filelist.xml"> <![if !supportTabStrip]> <link id="shLink" href="env_files/sheet001.htm"> <link id="shLink" href="env_files/sheet002.htm"> <link id="shLink" href="env_files/sheet003.htm"> <link id="shLink" href="env_files/sheet004.htm"> <link id="shLink" href="env_files/sheet005.htm"> <link id="shLink" href="env_files/sheet006.htm"> <link id="shLink" href="env_files/sheet007.htm"> <link id="shLink" href="env_files/sheet008.htm"> <link id="shLink" href="env_files/sheet009.htm"> <link id="shLink" href="env_files/sheet010.htm"> <link id="shLink" href="env_files/sheet011.htm"> <link id="shLink" href="env_files/sheet012.htm"> <link id="shLink"> <script language="JavaScript"> <!-- var c_lTabs=12; var c_rgszSh=new Array(c_lTabs); c_rgszSh[0] = "Patriots"; c_rgszSh[1] = "Giants"; c_rgszSh[2] = "Steelers"; c_rgszSh[3] = "Redskins"; c_rgszSh[4] = "Broncos"; c_rgszSh[5] = "Oilers"; c_rgszSh[6] = "Browns"; c_rgszSh[7] = "Raiders"; c_rgszSh[8] = "Vikings"; c_rgszSh[9] = "Falcons"; c_rgszSh[10] = "Cowboys"; c_rgszSh[11] = "Jets"; var c_rgszClr=new Array(8); c_rgszClr[0]="window"; c_rgszClr[1]="buttonface"; c_rgszClr[2]="windowframe"; c_rgszClr[3]="windowtext"; c_rgszClr[4]="threedlightshadow"; c_rgszClr[5]="threedhighlight"; c_rgszClr[6]="threeddarkshadow"; c_rgszClr[7]="threedshadow"; var g_iShCur; var g_rglTabX=new Array(c_lTabs); function fnGetIEVer() { var ua=window.navigator.userAgent var msie=ua.indexOf("MSIE") if (msie>0 && window.navigator.platform=="Win32") return parseInt(ua.substring(msie+5,ua.indexOf(".", msie))); else return 0; } function fnBuildFrameset() { var szHTML="<frameset rows=\"*,18\" border=0 width=0 frameborder=no f +ramespacing=0>"+ "<frame src=\""+document.all.item("shLink")[0].href+"\" name=\"frShe +et\" noresize>"+ "<frameset cols=\"54,*\" border=0 width=0 frameborder=no framespacin +g=0>"+ "<frame src=\"\" name=\"frScroll\" marginwidth=0 marginheight=0 scro +lling=no>"+ "<frame src=\"\" name=\"frTabs\" marginwidth=0 marginheight=0 scroll +ing=no>"+ "</frameset></frameset><plaintext>"; with (document) { open("text/html","replace"); write(szHTML); close(); } fnBuildTabStrip(); } function fnBuildTabStrip() { var szHTML= "<html><head><style>.clScroll {font:8pt Courier New;color:"+c_rgszCl +r[6]+";cursor:default;line-height:10pt;}"+ ".clScroll2 {font:10pt Arial;color:"+c_rgszClr[6]+";cursor:default;l +ine-height:11pt;}</style></head>"+ "<body onclick=\"event.returnValue=false;\" ondragstart=\"event.retu +rnValue=false;\" onselectstart=\"event.returnValue=false;\" bgcolor=" ++c_rgszClr[4]+" topmargin=0 leftmargin=0><table cellpadding=0 cellspa +cing=0 width=100%>"+ "<tr><td colspan=6 height=1 bgcolor="+c_rgszClr[2]+"></td></tr>"+ "<tr><td style=\"font:1pt\">&nbsp;<td>"+ "<td valign=top id=tdScroll class=\"clScroll\" onclick=\"parent.fnFa +stScrollTabs(0);\" onmouseover=\"parent.fnMouseOverScroll(0);\" onmou +seout=\"parent.fnMouseOutScroll(0);\"><a>&#171;</a></td>"+ "<td valign=top id=tdScroll class=\"clScroll2\" onclick=\"parent.fnS +crollTabs(0);\" ondblclick=\"parent.fnScrollTabs(0);\" onmouseover=\" +parent.fnMouseOverScroll(1);\" onmouseout=\"parent.fnMouseOutScroll(1 +);\"><a>&lt</a></td>"+ "<td valign=top id=tdScroll class=\"clScroll2\" onclick=\"parent.fnS +crollTabs(1);\" ondblclick=\"parent.fnScrollTabs(1);\" onmouseover=\" +parent.fnMouseOverScroll(2);\" onmouseout=\"parent.fnMouseOutScroll(2 +);\"><a>&gt</a></td>"+ "<td valign=top id=tdScroll class=\"clScroll\" onclick=\"parent.fnFa +stScrollTabs(1);\" onmouseover=\"parent.fnMouseOverScroll(3);\" onmou +seout=\"parent.fnMouseOutScroll(3);\"><a>&#187;</a></td>"+ "<td style=\"font:1pt\">&nbsp;<td></tr></table></body></html>"; with (frames['frScroll'].document) { open("text/html","replace"); write(szHTML); close(); } szHTML = "<html><head>"+ "<style>A:link,A:visited,A:active {text-decoration:none;"+"color:"+c +_rgszClr[3]+";}"+ ".clTab {cursor:hand;background:"+c_rgszClr[1]+";font:9pt Arial;padd +ing-left:3px;padding-right:3px;text-align:center;}"+ ".clBorder {background:"+c_rgszClr[2]+";font:1pt;}"+ "</style></head><body onload=\"parent.fnInit();\" onselectstart=\"ev +ent.returnValue=false;\" ondragstart=\"event.returnValue=false;\" bgc +olor="+c_rgszClr[4]+ " topmargin=0 leftmargin=0><table id=tbTabs cellpadding=0 cellspacin +g=0>"; var iCellCount=(c_lTabs+1)*2; var i; for (i=0;i<iCellCount;i+=2) szHTML+="<col width=1><col>"; var iRow; for (iRow=0;iRow<6;iRow++) { szHTML+="<tr>"; if (iRow==5) szHTML+="<td colspan="+iCellCount+"></td>"; else { if (iRow==0) { for(i=0;i<iCellCount;i++) szHTML+="<td height=1 class=\"clBorder\"></td>"; } else if (iRow==1) { for(i=0;i<c_lTabs;i++) { szHTML+="<td height=1 nowrap class=\"clBorder\">&nbsp;</td>"; szHTML+= "<td id=tdTab height=1 nowrap class=\"clTab\" onmouseover=\"pare +nt.fnMouseOverTab("+i+");\" onmouseout=\"parent.fnMouseOutTab("+i+"); +\">"+ "<a href=\""+document.all.item("shLink")[i].href+"\" target=\"fr +Sheet\" id=aTab>&nbsp;"+c_rgszSh[i]+"&nbsp;</a></td>"; } szHTML+="<td id=tdTab height=1 nowrap class=\"clBorder\"><a id=aTa +b>&nbsp;</a></td><td width=100%></td>"; } else if (iRow==2) { for (i=0;i<c_lTabs;i++) szHTML+="<td height=1></td><td height=1 class=\"clBorder\"></td>" +; szHTML+="<td height=1></td><td height=1></td>"; } else if (iRow==3) { for (i=0;i<iCellCount;i++) szHTML+="<td height=1></td>"; } else if (iRow==4) { for (i=0;i<c_lTabs;i++) szHTML+="<td height=1 width=1></td><td height=1></td>"; szHTML+="<td height=1 width=1></td><td></td>"; } } szHTML+="</tr>"; } szHTML+="</table></body></html>"; with (frames['frTabs'].document) { open("text/html","replace"); charset=document.charset; write(szHTML); close(); } } function fnInit() { g_rglTabX[0]=0; var i; for (i=1;i<=c_lTabs;i++) with (frames['frTabs'].document.all.tbTabs.rows[1].cells[fnTabToCol( +i-1)]) g_rglTabX[i]=offsetLeft+offsetWidth-6; } function fnTabToCol(iTab) { return 2*iTab+1; } function fnNextTab(fDir) { var iNextTab=-1; var i; with (frames['frTabs'].document.body) { if (fDir==0) { if (scrollLeft>0) { for (i=0;i<c_lTabs&&g_rglTabX[i]<scrollLeft;i++); if (i<c_lTabs) iNextTab=i-1; } } else { if (g_rglTabX[c_lTabs]+6>offsetWidth+scrollLeft) { for (i=0;i<c_lTabs&&g_rglTabX[i]<=scrollLeft;i++); if (i<c_lTabs) iNextTab=i; } } } return iNextTab; } function fnScrollTabs(fDir) { var iNextTab=fnNextTab(fDir); if (iNextTab>=0) { frames['frTabs'].scroll(g_rglTabX[iNextTab],0); return true; } else return false; } function fnFastScrollTabs(fDir) { if (c_lTabs>16) frames['frTabs'].scroll(g_rglTabX[fDir?c_lTabs-1:0],0); else if (fnScrollTabs(fDir)>0) window.setTimeout("fnFastScrollTabs("+fDir ++");",5); } function fnSetTabProps(iTab,fActive) { var iCol=fnTabToCol(iTab); var i; if (iTab>=0) { with (frames['frTabs'].document.all) { with (tbTabs) { for (i=0;i<=4;i++) { with (rows[i]) { if (i==0) cells[iCol].style.background=c_rgszClr[fActive?0:2]; else if (i>0 && i<4) { if (fActive) { cells[iCol-1].style.background=c_rgszClr[2]; cells[iCol].style.background=c_rgszClr[0]; cells[iCol+1].style.background=c_rgszClr[2]; } else { if (i==1) { cells[iCol-1].style.background=c_rgszClr[2]; cells[iCol].style.background=c_rgszClr[1]; cells[iCol+1].style.background=c_rgszClr[2]; } else { cells[iCol-1].style.background=c_rgszClr[4]; cells[iCol].style.background=c_rgszClr[(i==2)?2:4]; cells[iCol+1].style.background=c_rgszClr[4]; } } } else cells[iCol].style.background=c_rgszClr[fActive?2:4]; } } } with (aTab[iTab].style) { cursor=(fActive?"default":"hand"); color=c_rgszClr[3]; } } } } function fnMouseOverScroll(iCtl) { frames['frScroll'].document.all.tdScroll[iCtl].style.color=c_rgszClr[ +7]; } function fnMouseOutScroll(iCtl) { frames['frScroll'].document.all.tdScroll[iCtl].style.color=c_rgszClr[ +6]; } function fnMouseOverTab(iTab) { if (iTab!=g_iShCur) { var iCol=fnTabToCol(iTab); with (frames['frTabs'].document.all) { tdTab[iTab].style.background=c_rgszClr[5]; } } } function fnMouseOutTab(iTab) { if (iTab>=0) { var elFrom=frames['frTabs'].event.srcElement; var elTo=frames['frTabs'].event.toElement; if ((!elTo) || (elFrom.tagName==elTo.tagName) || (elTo.tagName=="A" && elTo.parentElement!=elFrom) || (elFrom.tagName=="A" && elFrom.parentElement!=elTo)) { if (iTab!=g_iShCur) { with (frames['frTabs'].document.all) { tdTab[iTab].style.background=c_rgszClr[1]; } } } } } function fnSetActiveSheet(iSh) { if (iSh!=g_iShCur) { fnSetTabProps(g_iShCur,false); fnSetTabProps(iSh,true); g_iShCur=iSh; } } window.g_iIEVer=fnGetIEVer(); if (window.g_iIEVer>=4) fnBuildFrameset(); //--> </script> <![endif]><!--[if gte mso 9]><xml> <x:ExcelWorkbook> <x:ExcelWorksheets> <x:ExcelWorksheet> <x:Name>Patriots</x:Name> <x:WorksheetSource HRef="env_files/sheet001.htm"/> </x:ExcelWorksheet> <x:ExcelWorksheet> <x:Name>Giants</x:Name> <x:WorksheetSource HRef="env_files/sheet002.htm"/> </x:ExcelWorksheet> <x:ExcelWorksheet> <x:Name>Steelers</x:Name> <x:WorksheetSource HRef="env_files/sheet003.htm"/> </x:ExcelWorksheet> <x:ExcelWorksheet> <x:Name>Redskins</x:Name> <x:WorksheetSource HRef="env_files/sheet004.htm"/> </x:ExcelWorksheet> <x:ExcelWorksheet> <x:Name>Broncos</x:Name> <x:WorksheetSource HRef="env_files/sheet005.htm"/> </x:ExcelWorksheet> <x:ExcelWorksheet> <x:Name>Oilers</x:Name> <x:WorksheetSource HRef="env_files/sheet006.htm"/> </x:ExcelWorksheet> <x:ExcelWorksheet> <x:Name>Browns</x:Name> <x:WorksheetSource HRef="env_files/sheet007.htm"/> </x:ExcelWorksheet> <x:ExcelWorksheet> <x:Name>Raiders</x:Name> <x:WorksheetSource HRef="env_files/sheet008.htm"/> </x:ExcelWorksheet> <x:ExcelWorksheet> <x:Name>Vikings</x:Name> <x:WorksheetSource HRef="env_files/sheet009.htm"/> </x:ExcelWorksheet> <x:ExcelWorksheet> <x:Name>Falcons</x:Name> <x:WorksheetSource HRef="env_files/sheet010.htm"/> </x:ExcelWorksheet> <x:ExcelWorksheet> <x:Name>Cowboys</x:Name> <x:WorksheetSource HRef="env_files/sheet011.htm"/> </x:ExcelWorksheet> <x:ExcelWorksheet> <x:Name>Jets</x:Name> <x:WorksheetSource HRef="env_files/sheet012.htm"/> </x:ExcelWorksheet> </x:ExcelWorksheets> <x:Stylesheet HRef="env_files/stylesheet.css"/> <x:WindowHeight>12525</x:WindowHeight> <x:WindowWidth>13665</x:WindowWidth> <x:WindowTopX>0</x:WindowTopX> <x:WindowTopY>0</x:WindowTopY> <x:ProtectStructure>False</x:ProtectStructure> <x:ProtectWindows>False</x:ProtectWindows> </x:ExcelWorkbook> </xml><![endif]--> </head> <frameset rows="*,39" border=0 width=0 frameborder=no framespacing=0> <frame src="env_files/sheet001.htm" name="frSheet"> <frame src="env_files/tabstrip.htm" name="frTabs" marginwidth=0 margi +nheight=0> <noframes> <body> <p>This page uses frames, but your browser doesn't support them.</p +> </body> </noframes> </frameset> </html>

      And here is a sample of one of the 12 worksheet files produced

      <html xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:x="urn:schemas-microsoft-com:office:excel" xmlns="http://www.w3.org/TR/REC-html40"> <head> <meta http-equiv=Content-Type content="text/html; charset=windows-1252 +"> <meta name=ProgId content=Excel.Sheet> <meta name=Generator content="Microsoft Excel 14"> <link id=Main-File rel=Main-File href="../env.htm"> <link rel=File-List href=filelist.xml> <link rel=Stylesheet href=stylesheet.css> <style> <!--table {mso-displayed-decimal-separator:"\."; mso-displayed-thousand-separator:"\,";} @page {mso-header-data:""; mso-footer-data:""; margin:1.0in .75in 1.0in .75in; mso-header-margin:.5in; mso-footer-margin:.5in;} --> </style> <![if !supportTabStrip]><script language="JavaScript"> <!-- function fnUpdateTabs() { if (parent.window.g_iIEVer>=4) { if (parent.document.readyState=="complete" && parent.frames['frTabs'].document.readyState=="complete") parent.fnSetActiveSheet(11); else window.setTimeout("fnUpdateTabs();",150); } } if (window.name!="frSheet") window.location.replace("../env.htm"); else fnUpdateTabs(); //--> </script> <![endif]> </head>
      By saying "we" while admonishing people, you sound like a royal ass, its just you guy

        Hi Monks, not sure if this question got lost over time, but was hoping to get it back on the radar

        Any advice would be greatly appreciated

        Also, I'd like to say thanks again to all the people that give their time to help answer questions. you guys are great!