Matt,
I use pair for web hosting, and they have v.1.08 installed, which is probably why it didn't like the tree method, right? I have asked them to upgrade to the latest version, and I'll get back to it when they finish. (Installing a local copy seemed a bit silly when they already have the module installed, even if it is an older version.) I still don't have a code sample of my regexp implementation, but I'll get to it sooner or later. I'll try to give you feedback, if I can figure out what is due to my inexperience and what is not. I look forward to trying this out!
Thanks again,
Mark
Matt,
I have finally gotten TableExtract and (hopefully) all of its prerequisites installed in my local directory at pair networks, since they didn't seem too keen on installing them centrally for some reason. That done, I have a small snippet of the script that I will post here. (The rest of the script is not relevant to this process or to this discussion.) I think this should be enough to stand alone and get the point across. When I run the script I get the following message, which suggests to me that something isn't installed quite right:
Can't call method "tag" without a package or object reference at /usr/home/mllott/perl/lib/perl5/site_perl/5.8.3/HTML/ElementTable.pm line 367.
For help, please send mail to the webmaster (mllott@pair.com), giving this error message and the time and date of the error.
(Nice how they tell me to contact myself when something goes wrong; I always get a good laugh out of that.) I am trying to get a feel for the module by running one of your examples on an excerpt of the HTML that I will eventually be processing. I started by commenting out everything, then un-commenting one line at a time. The $te->parse($html); line is where things stop working.
Here is the snippet:
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TableExtract qw(tree);
my $html = html_test();
my $te = HTML::TableExtract->new( headers => [qw(DESCRIPTION STOCK_NO)] );
$te->parse($html);   # this is the call that dies
# The remaining steps stay commented out until parse() works.
# my $table = $te->first_table_found;
# my $table_tree = $table->tree;
# $table_tree->cell(2,2)->replace_content('Test');
# my $table_html = $table_tree->as_HTML;
# my $table_text = $table_tree->as_text;
# my $document_tree = $te->tree;
# $html = $document_tree->as_HTML;
print "Content-type: text/html\n\n" . $html;
sub html_test
{
return qq(
<html>
<head>
<title>Test Page</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div id="Layer1" style="position:absolute; left:22px; top:24px; width:104px; height:29px; z-index:1"><font face="Georgia, Times New Roman, Times, serif"><b><font size="4">Page-131</font></b></font></div>
<table width="525" border="0" cellspacing="1">
<tr align="left" bgcolor="#0033CC">
<td bgcolor="#0033CC"> <div align="center"><font color='white'>DESCRIPTION</font></div></td>
<td bgcolor="#0033CC"> <div align="center"><font color="#FFFFFF">STOCK_NO</font></div></td>
</tr>
<tr bgcolor="#FFFF00">
<td width="406" align="left">1997-02 6 CYL 4.0L TJ WRANGLER</td>
<td width="112"> <div align="center">AEM-218300C</div></td>
</tr>
<tr bgcolor="#FFFFFF">
<td width="406" align="left">1997-02 4 CYL 2.5L TJ WRANGLER</td>
<td width="112"> <div align="center">AEM-218301C</div></td>
</tr>
<tr bgcolor="#FFFF00">
<td align="left">2000-UP 6 CYL 3.7L KJ LIBERTY</td>
<td>
<div align="center">AEM-218302C</div></td>
</tr>
<tr bgcolor="#FFFFFF">
<td height="22" align="left">1993-98 V8 (BOTH 5.2L& 5.9L) ZJ GRAND CHEROKEE</td>
<td><div align="center">AEM-218303C</div></td>
</tr>
</table>
</body>
</html>
);
}
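Since the error points at the installation rather than at the snippet, one way to narrow it down might be to check which copy and version of each module Perl actually loads. The following is only a rough sketch: the module list is an assumption based on the error message above, and module_report is a made-up helper name, not part of any of these modules.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Report the version and file location of each module that can be loaded.
# module_report is a hypothetical helper written for this check only.
sub module_report {
    my @modules = @_;
    my @lines;
    for my $module (@modules) {
        # Turn Some::Module into Some/Module.pm, the form require expects.
        (my $file = "$module.pm") =~ s{::}{/}g;
        if (eval { require $file; 1 }) {
            my $version = $module->VERSION;
            $version = 'unknown' unless defined $version;
            push @lines, "$module $version loaded from $INC{$file}";
        }
        else {
            # Keep only the first line of the error to stay readable.
            my ($why) = split /\n/, $@;
            push @lines, "$module not loadable: $why";
        }
    }
    return @lines;
}

# Assumed list of modules involved in tree mode; adjust as needed.
print "$_\n" for module_report(
    qw(HTML::TableExtract HTML::ElementTable HTML::TreeBuilder HTML::Parser)
);
```

If a module shows up from a system directory instead of the local one, or with an older version than expected, that would suggest a mismatch between the local and central installs.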
(Once I am able to manipulate tables, I will be using the STOCK_NO field to grab data from a database, and add a price and shopping cart button to the table. This part will be no sweat.)
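A rough sketch of how that later step might look, independent of TableExtract: each row is taken as a description and stock number pair, and a price cell and cart link are appended. Everything specific here is a placeholder. The %price_for hash stands in for the database lookup, the price value is made up, and the cart URL and the helper names are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the database; the price is a placeholder value.
my %price_for = ( 'AEM-218300C' => '0.00' );

# Minimal HTML escaping for text placed into cells and attributes.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Build one output row from a description and a stock number.
sub priced_row {
    my ($description, $stock_no) = @_;
    my $price = exists $price_for{$stock_no} ? $price_for{$stock_no} : 'n/a';
    # Hypothetical cart URL; a real one would also need URI encoding.
    my $cart  = '/cgi-bin/cart.pl?item=' . escape_html($stock_no);
    my @cells = map { escape_html($_) } ($description, $stock_no, $price);
    return sprintf(
        '<tr><td>%s</td><td>%s</td><td>%s</td><td><a href="%s">Add to cart</a></td></tr>',
        @cells, $cart
    );
}

print priced_row('1997-02 6 CYL 4.0L TJ WRANGLER', 'AEM-218300C'), "\n";
```

Stock numbers without an entry in the lookup get 'n/a' in the price cell rather than stopping the script.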
Any feedback? Thanks in advance!
Regards,
Mark