Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi all,

I tried the HTML::TableExtract module and I am getting the expected data.
But the problem is that I can't preserve the table's structure.

I would like to display the selected columns in the same table format as in the original HTML.

I am attaching my HTML code...

<html>
<head>
<title>Calculate</title>
</head>
<body bgcolor="#FFFFFF" text="#000000">
<table border="0" width="100%" cellspacing="0" cellpadding="0">
  <tr>
    <td width="84" bgcolor="#000000"><font face="Arial Narrow" color="#FFFFFF"><strong>Account </strong></font></td>
    </font><td width="65" bgcolor="#000000"><font face="Arial Narrow" color="#FFFFFF"><strong>Date</strong></font></td>
    <td width="65" bgcolor="#000000"><font face="Arial Narrow" color="#FFFFFF"><strong>Start</strong></font></td>
    <td width="65" bgcolor="#000000"><font face="Arial Narrow" color="#FFFFFF"><strong>End</strong></font></td>
    <td width="107" bgcolor="#000000"><strong><font face="Arial Narrow" color="#FFFFFF">Description</font></strong></td>
    <td width="119" bgcolor="#000000"><font face="Arial Narrow" color="#FFFFFF"><strong>Detail</strong></font></td>
    <td width="55" bgcolor="#000000"><font face="Arial Narrow" color="#FFFFFF"><strong>Qty.</strong></font></td>
    <td width="55" bgcolor="#000000"><font face="Arial Narrow" color="#FFFFFF"><strong>Actual</strong></font></td>
    <td width="46" bgcolor="#000000"><font face="Arial Narrow" color="#FFFFFF"><strong>Rate</strong></font></td>
    <td width="86" bgcolor="#000000"><font face="Arial Narrow" color="#FFFFFF"><strong>Amount</strong></font></td>
  </tr>
  <tr>
    <td width="84" bgcolor="#C0C0C0"><small><font face="Arial Narrow">162456</font></small> </td>
    <td width="65" bgcolor="#C0C0C0"><small><font face="Arial Narrow">7/1/03</font></small> </td>
    <td width="65" bgcolor="#C0C0C0"><small><font face="Arial Narrow">18:27:35</font></small> </td>
    <td width="65" bgcolor="#C0C0C0"><small><font face="Arial Narrow">18:37:35</font></small> </td>
    <td width="107" bgcolor="#C0C0C0"><font face="Arial Narrow"><small>FROM 1001</small> </font></td>
    <td width="119" bgcolor="#C0C0C0"><font face="Arial Narrow"><small>UNITED STATES 3217150214</small> </font></td>
    <td width="55" bgcolor="#C0C0C0"><small><font face="Arial Narrow">10.00</font></small> </td>
    <td width="55" bgcolor="#C0C0C0"><small><font face="Arial Narrow">10.00</font></small> </td>
    <td width="46" bgcolor="#C0C0C0"><small><font face="Arial Narrow">0.10</font></small> </td>
    <td width="86" bgcolor="#C0C0C0"><small><font face="Arial Narrow">$1.00</font></small> </td>
  </tr>
  <tr>
    <td width="84" ><small><font face="Arial Narrow">162456</font></small> </td>
    <td width="65" ><small><font face="Arial Narrow">7/2/03</font></small> </td>
    <td width="65" ><small><font face="Arial Narrow">02:54:56</font></small> </td>
    <td width="65" ><small><font face="Arial Narrow">02:57:38</font></small> </td>
    <td width="107" ><font face="Arial Narrow"><small>FROM 1001</small> </font></td>
    <td width="119" ><font face="Arial Narrow"><small>UNITED STATES 5217733259</small> </font></td>
    <td width="55" ><small><font face="Arial Narrow">3.00</font></small> </td>
    <td width="55" ><small><font face="Arial Narrow">2.70</font></small> </td>
    <td width="46" ><small><font face="Arial Narrow">0.10</font></small> </td>
    <td width="86" ><small><font face="Arial Narrow">$0.30</font></small> </td>
  </tr>
  <tr>
    <td width="84" bgcolor="#C0C0C0"><small><font face="Arial Narrow">162456</font></small> </td>
    <td width="65" bgcolor="#C0C0C0"><small><font face="Arial Narrow">7/2/03</font></small> </td>
    <td width="65" bgcolor="#C0C0C0"><small><font face="Arial Narrow">03:02:57</font></small> </td>
    <td width="65" bgcolor="#C0C0C0"><small><font face="Arial Narrow">03:05:21</font></small> </td>
    <td width="107" bgcolor="#C0C0C0"><font face="Arial Narrow"><small>FROM 1001</small> </font></td>
    <td width="119" bgcolor="#C0C0C0"><font face="Arial Narrow"><small>UNITED STATES 4633895267</small> </font></td>
    <td width="55" bgcolor="#C0C0C0"><small><font face="Arial Narrow">3.00</font></small> </td>
    <td width="55" bgcolor="#C0C0C0"><small><font face="Arial Narrow">2.40</font></small> </td>
    <td width="46" bgcolor="#C0C0C0"><small><font face="Arial Narrow">0.10</font></small> </td>
    <td width="86" bgcolor="#C0C0C0"><small><font face="Arial Narrow">$0.30</font></small> </td>
  </tr>
  <tr>
    <td width="84" ><small><font face="Arial Narrow">162456</font></small> </td>
    <td width="65" ><small><font face="Arial Narrow">7/2/03</font></small> </td>
    <td width="65" ><small><font face="Arial Narrow">03:13:24</font></small> </td>
    <td width="65" ><small><font face="Arial Narrow">03:14:24</font></small> </td>
    <td width="107" ><font face="Arial Narrow"><small>FROM 1001</small> </font></td>
    <td width="119" ><font face="Arial Narrow"><small>UNITED STATES 60958767236</small> </font></td>
    <td width="55" ><small><font face="Arial Narrow">1.00</font></small> </td>
    <td width="55" ><small><font face="Arial Narrow">1.00</font></small> </td>
    <td width="46" ><small><font face="Arial Narrow">0.10</font></small> </td>
    <td width="86" ><small><font face="Arial Narrow">$0.10</font></small> </td>
  </tr>
  <tr>
    <td width="84" bgcolor="#C0C0C0"><strong><small><font face="Arial Narrow">&nbsp;</font></small></strong> </td>
    <td width="65" bgcolor="#C0C0C0"><strong><small><font face="Arial Narrow">&nbsp;</font></small></strong> </td>
    <td width="65" bgcolor="#C0C0C0"><strong><small><font face="Arial Narrow">Total</font></small></strong> </td>
    <td width="65" bgcolor="#C0C0C0"></td>
    <td width="107" bgcolor="#C0C0C0"><font face="Arial Narrow"><strong><small>Calls: 4</small></strong></font></td>
    <td width="119" bgcolor="#C0C0C0"></td>
    <td width="55" bgcolor="#C0C0C0"><strong><small><font face="Arial Narrow">209.00</font></small></strong> </td>
    <td width="55" bgcolor="#C0C0C0"><strong><small><font face="Arial Narrow">182.30</font></small></strong> </td>
    <td width="46" bgcolor="#C0C0C0"></td>
    <td width="86" bgcolor="#C0C0C0"><strong><small><font face="Arial Narrow">$21.44</font></small></strong> </td>
  </tr>
  <tr>
    <td width="84" ><strong><font face="Arial Narrow" color="#000000"><small>&nbsp;</small> </font></strong></td>
    <td width="65" ><strong><font face="Arial Narrow" color="#000000"><small>&nbsp;</small> </font></strong></td>
    <td width="65" ><strong><font face="Arial Narrow" color="#000000"><small>Total</small></font></strong></td>
    <td width="65" ></td>
    <td width="107" <font face="Arial Narrow"><strong><font face="Arial Narrow" color="#000000"><small>Credits: 0</small> </font></strong></td>
    <td width="119" <font face="Arial Narrow"></td>
    <td width="55" ></td>
    <td width="55" ></td>
    <td width="46" ></td>
    <td width="86" ><strong><font face="Arial Narrow" color="#000000"><small>$0.00</small> </font></strong></td>
  </tr>
</table>
<p align="center"><a href="acctaccount.asp"><small><font face="Arial Narrow">Return to Edit By Account</font></small></a></p>
</body>
</html>

and the Perl code is written as follows:

#!/usr/bin/perl
use strict;
use warnings;
use HTML::TableExtract;

print "Content-type: text/html\n\n";

# Slurp the whole file: with $/ undefined, <$fh> returns everything at once.
open(my $fh, '<', "/usr/fweb/mydomain/html/test.html")
    or die "Cannot open test.html: $!";
my $string = do { local $/; <$fh> };
close($fh);

# Keep only the columns whose header cells match these labels.
my $headers = ['Account', 'Detail', 'Amount'];
my $te = HTML::TableExtract->new(headers => $headers);
$te->parse($string);

foreach my $ts ($te->table_states) {
    print "Column1 &nbsp;&nbsp;&nbsp; Column2 &nbsp;&nbsp;&nbsp; Column3 <br>";
    # Each row is an array reference holding the selected cells' text.
    foreach my $row ($ts->rows) {
        print join('&nbsp;&nbsp;&nbsp;', @$row), "<br>";
    }
}

Please modify the above code to get a proper table display.
Thanks in advance

update (broquaint): added <readmore> tags and fixed formatting

Re: How to catch table format display from actual html using TableExtract?
by graff (Chancellor) on Jul 10, 2003 at 08:16 UTC
    I'm not sure if I understand the nature of your problem. You say:

    I am getting the expected data. But the problem is that I can't preserve the table's structure. I would like to display the selected columns in the same table format as in the original HTML.

    Do you mean that you are not getting just the three particular columns you want in your output ('Account', 'Detail', 'Amount')? If that's the case, then maybe the first row of the HTML table should be using <th> tags instead of <td> tags -- I'm guessing that HTML::TableExtract may be relying on the markup to be that way, so it knows where to find the "header" labels in the html table structure.

    If that's not the issue, then you'll need to explain your problem more carefully. Just what sort of output are you getting, and how does it differ from your intention, exactly?

      Hi
      Thanks for your reply.
      I would like to get the final output as ...
      <table border="0" width="100%" cellspacing="0" cellpadding="0">
        <tr>
          <td width="84" bgcolor="#000000"><font face="Arial Narrow" color="#FFFFFF"><strong>Account </strong></font></td>
          <td width="119" bgcolor="#000000"><font face="Arial Narrow" color="#FFFFFF"><strong>Detail</strong></font></td>
          <td width="86" bgcolor="#000000"><font face="Arial Narrow" color="#FFFFFF"><strong>Amount</strong></font></td>
        </tr>
        <tr>
          <td width="84" bgcolor="#C0C0C0"><small><font face="Arial Narrow">162456</font></small> </td>
          <td width="119" bgcolor="#C0C0C0"><font face="Arial Narrow"><small>UNITED STATES 3217150214</small> </font></td>
          <td width="86" bgcolor="#C0C0C0"><small><font face="Arial Narrow">$1.00</font></small> </td>
        </tr>
        <tr>
          <td width="84" ><small><font face="Arial Narrow">162456</font></small> </td>
          <td width="119" ><font face="Arial Narrow"><small>UNITED STATES 5217733259</small> </font></td>
          <td width="86" ><small><font face="Arial Narrow">$0.30</font></small> </td>
        </tr>
        <tr>
          <td width="84" bgcolor="#C0C0C0"><small><font face="Arial Narrow">162456</font></small> </td>
          <td width="119" bgcolor="#C0C0C0"><font face="Arial Narrow"><small>UNITED STATES 4633895267</small> </font></td>
          <td width="86" bgcolor="#C0C0C0"><small><font face="Arial Narrow">$0.30</font></small> </td>
        </tr>
        <tr>
          <td width="84" ><small><font face="Arial Narrow">162456</font></small> </td>
          <td width="119" ><font face="Arial Narrow"><small>UNITED STATES 60958767236</small> </font></td>
          <td width="86" ><small><font face="Arial Narrow">$0.10</font></small> </td>
        </tr>
      </table>
      Thanks
        I would like to get the final output as ...

        Okay... so, uhm... what's stopping you from getting that, exactly? What are you getting instead? If you are getting what you want from the TableExtract module, the rest is just a matter of printing stuff out.

        You might want to look into something like HTML::Template, to have the page/table layout and all those font attributes in a separate html text file (not embedded in your perl code), and the perl script would just populate that template with the actual data values. (Or, if this is just a one-shot deal, just set up a nested loop: outer loop prints the table rows, inner loop prints the cells on each row. That way, at least you only have to state all those attributes once.)
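
        The nested-loop idea could be sketched roughly as follows. This is only an illustration, not tested against the poster's setup: @rows is a hypothetical stand-in for what $ts->rows returns in the original script, render_table is a made-up helper name, and the widths, colors and alternating gray rows are copied from the HTML in the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the rows returned by $ts->rows in the
# original script: one array reference per extracted table row.
my @rows = (
    [ '162456', 'UNITED STATES 3217150214', '$1.00' ],
    [ '162456', 'UNITED STATES 5217733259', '$0.30' ],
);

# Column labels and widths, copied from the HTML in the question.
my @headers = ( 'Account', 'Detail', 'Amount' );
my @widths  = ( 84, 119, 86 );

# Build the table as a string so the attributes are stated only once.
sub render_table {
    my ($headers, $widths, $rows) = @_;
    my $html = qq{<table border="0" width="100%" cellspacing="0" cellpadding="0">\n};

    # Header row: bold white text on a black background.
    $html .= "<tr>\n";
    for my $i (0 .. $#$headers) {
        $html .= qq{<td width="$widths->[$i]" bgcolor="#000000"><font face="Arial Narrow" color="#FFFFFF"><strong>$headers->[$i]</strong></font></td>\n};
    }
    $html .= "</tr>\n";

    # Outer loop: one <tr> per data row, every other row shaded gray.
    my $n = 0;
    for my $row (@$rows) {
        my $bg = ($n++ % 2 == 0) ? ' bgcolor="#C0C0C0"' : '';
        $html .= "<tr>\n";
        # Inner loop: one <td> per cell; empty cells become &nbsp;.
        for my $i (0 .. $#$row) {
            my $cell = defined $row->[$i] ? $row->[$i] : '&nbsp;';
            $html .= qq{<td width="$widths->[$i]"$bg><small><font face="Arial Narrow">$cell</font></small></td>\n};
        }
        $html .= "</tr>\n";
    }
    return $html . "</table>\n";
}

print render_table(\@headers, \@widths, \@rows);
```

        In the real script you would presumably pass [ $ts->rows ] in place of \@rows, after printing the Content-type header. The cell text is inserted as-is here; if the data can contain characters such as < or &, it would need HTML-escaping first.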