jaydon has asked for the wisdom of the Perl Monks concerning the following question:
Good morning, everyone.
I'm getting strange results from the TableExtract module: for the first data row it returns only columns 2 and 3 out of 10, but it seems to work fine for the rest of the table.
I modified the HTML source to remove some formatting, and changed one column header to eliminate a duplicate. Directly below is the original source with the removed and modified lines marked; after that I have pasted the modified HTML:
Original:
<table width="100%" border="0" cellpadding="0" cellspacing="0" id="titleTable"><tr><th>(US$/Barrel) </th></tr></table>
<table width="100%" border="0" cellpadding="0"cellspacing="0" id="dataTable" >
<tr>
  <td rowspan="2" align="left"> <strong>Month </strong></td>
  <td rowspan="2" align="right"><strong>First</strong></td>
  <td rowspan="2" align="right"><strong>High</strong></td>
  <td rowspan="2" align="right"><strong>Low</strong></td>
  <td rowspan="2" align="right"><strong>Sett</strong></td>
  <td rowspan="2" align="right"><strong>Chg</strong></td>
  <td rowspan="2" align="right"><strong>Vol</strong></td>
  <td rowspan="2" align="right"><strong>BWAVE*</strong></td>
  <td width="1" align="center" style=""><img src="/ice/images/shim.gif" width="1" height="1"></td>   <- REMOVED THIS LINE
  <td colspan="2" align="center" style=""><span style="font-weight: bold">Prev. Business   <- REMOVED THIS LINE
  Day</span></td>   <- REMOVED THIS LINE
<tr valign="top">   <- REMOVED THIS LINE
  <td width="1" align="center" bgcolor="#FFFFFF" style=""> </td>
  <td align="right"><strong>Vol</strong></td>   <- MODIFIED TEXT
  <td align="right" style=""><strong>Open Int</strong></td>
</tr>
<tr valign="top" class="qlBgColor">
  <td>Dec-05 </td>
  <td align="right">5525</td>
  <td align="right">5572</td>
  <td align="right">5450</td>
  <td align="right">5473</td>
  <td align="right">-0.26 </td>
  <td align="right">30280</td>
  <td align="right">5513</td>
  <td width="1" align="right" bgcolor="#FFFFFF" style=""> </td>
  <td align="right">37349</td>
  <td align="right" style="">28286 </td>
</tr>
<tr valign="top" class="">
  <td>Jan-06 </td>
  <td align="right">5620</td>
  <td align="right">5669</td>
  <td align="right">5561</td>
  <td align="right">5590</td>
  <td align="right">-0.08 </td>
  <td align="right">54107</td>
  <td align="right">5609</td>
  <td width="1" align="right" bgcolor="#FFFFFF" style=""> </td>
  <td align="right">55631</td>
  <td align="right" style="">109414 </td>
</tr>
...
...
...
</table>
Modified source code:
<table width="100%" border="0" cellpadding="0" cellspacing="0" id="titleTable"><tr><th>(US/Barrel)</th></tr></table>
<table width="100%" border="0" cellpadding="0" cellspacing="0" id="dataTable" >
<tr>
  <td rowspan="2" align="left"> <strong>Month </strong></td>
  <td rowspan="2" align="right"><strong>First</strong></td>
  <td rowspan="2" align="right"><strong>High</strong></td>
  <td rowspan="2" align="right"><strong>Low</strong></td>
  <td rowspan="2" align="right"><strong>Sett</strong></td>
  <td rowspan="2" align="right"><strong>Chg</strong></td>
  <td rowspan="2" align="right"><strong>Vol</strong></td>
  <td rowspan="2" align="right"><strong>BWAVE*</strong></td>
  <td width="1" align="center" bgcolor="#FFFFFF" style=""></td>
  <td align="right"><strong>Prev Vol</strong></td>
  <td align="right" style=""><strong>Open Int</strong></td>
</tr>
<tr valign="top" class="qlBgColor">
  <td>Dec-05 </td>
  <td align="right">5525</td>
  <td align="right">5572</td>
  <td align="right">5450</td>
  <td align="right">5473</td>
  <td align="right">-0.26 </td>
  <td align="right">30280</td>
  <td align="right">5513</td>
  <td width="1" align="right" bgcolor="#FFFFFF" style=""></td>
  <td align="right">37349</td>
  <td align="right" style="">28286 </td>
</tr>
<tr valign="top" class="">
  <td>Jan-06 </td>
  <td align="right">5620</td>
  <td align="right">5669</td>
  <td align="right">5561</td>
  <td align="right">5590</td>
  <td align="right">-0.08 </td>
  <td align="right">54107</td>
  <td align="right">5609</td>
  <td width="1" align="right" bgcolor="#FFFFFF" style=""></td>
  <td align="right">55631</td>
  <td align="right" style="">109414 </td>
</tr>
...
...
...
</table>
Results expected:
Month First High Low Sett Chg Vol BWAVE* Vol Open Int
Dec-05 5525 5572 5450 5473 -0.26 30280 5513 37349 28286
Jan-06 5620 5669 5561 5590 -0.08 54107 5609 55631 109414
Feb-06 5719 5748 5644 5670 -0.14 18911 5687 27073 75486
Mar-06 5770 5800 5706 5726 -0.16 6072 5741 9623 20557
Apr-06 5820 5833 5744 5766 -0.13 2260 5768 5076 12757
May-06 5818 5847 5764 5786 -0.13 936 5794 2724 8312
Results obtained:
Month First High Low Sett Chg Vol BWAVE* Prev Vol Open Int
5525 5572
Jan-06 5620 5669 5561 5590 -0.08 54107 5609 55631 109414
Feb-06 5719 5748 5644 5670 -0.14 18911 5687 27073 75486
Mar-06 5770 5800 5706 5726 -0.16 6072 5741 9623 20557
Apr-06 5820 5833 5744 5766 -0.13 2260 5768 5076 12757
May-06 5818 5847 5764 5786 -0.13 936 5794 2724 8312
As you can see, the first data row is missing its first value, Dec-05, and also the values for columns 4 through 10; only 5525 and 5572 (columns 2 and 3) come through.
On previous occasions I have modified the HTML and gotten the results I wanted, so I cannot understand what is happening here. I would be grateful if someone could provide some insight.
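For context, the extraction script itself is not included in the question. The following is only a minimal sketch of how such an extraction is commonly set up with HTML::TableExtract. The helper name extract_rows, the choice of depth/count to select the table, and the reduced HTML (three columns, no rowspan attributes) are all assumptions made for illustration; this is not the code that produced the results shown.

```perl
use strict;
use warnings;
use HTML::TableExtract;

# Hypothetical helper (not from the original post): return every row of the
# second top-level table as a list of array references.
sub extract_rows {
    my ($html) = @_;

    # depth => 0 selects top-level tables; count => 1 selects the second
    # table at that depth (counts start at 0), i.e. "dataTable" below.
    my $te = HTML::TableExtract->new( depth => 0, count => 1 );
    $te->parse($html);

    my @rows;
    foreach my $ts ( $te->tables ) {
        foreach my $row ( $ts->rows ) {
            # Cells may be undef (for example, empty grid positions);
            # map them to empty strings so they can be joined safely.
            push @rows, [ map { defined $_ ? $_ : '' } @$row ];
        }
    }
    return @rows;
}

# Reduced stand-in for the page in the post: a title table, then a data table.
my $html = <<'HTML';
<table id="titleTable"><tr><th>(US/Barrel)</th></tr></table>
<table id="dataTable">
<tr><td><strong>Month</strong></td><td><strong>First</strong></td><td><strong>High</strong></td></tr>
<tr><td>Dec-05</td><td>5525</td><td>5572</td></tr>
<tr><td>Jan-06</td><td>5620</td><td>5669</td></tr>
</table>
HTML

# Print one tab-separated line per extracted row.
print join( "\t", @$_ ), "\n" for extract_rows($html);
```

Running the same helper against the modified HTML from the post would be one way to check whether the problem reproduces outside the original script.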
READMORE tags added by Arunbear
Re: TableExtract Module won't get complete first row!
by kulls (Hermit) on Nov 16, 2005 at 11:20 UTC