heezy has asked for the wisdom of the Perl Monks concerning the following question:
Hi monks
I want to parse a tree structure of webpages (all with a common-ish formatting) and add key pieces of data from the pages to a database.
I am really overwhelmed by the number of HTML modules available and wondered if anyone has any comments on what to avoid, best practices, major pitfalls, time-saving Perl-ish plans, etc.
I want to glean three pieces of info from each row of a table like this one on the webpage:
<TABLE WIDTH="100%" BORDER="0" CELLSPACING="1" CELLPADDING="3">
<!-- CATEGORY -->
<TR><TD CLASS="dkblue" COLSPAN="3"><A NAME="Sun Ultra 60"></A><BIG>
<B>Sun Ultra 60 Documentation</B></BIG></TD></TR>
<TR VALIGN="TOP" CLASS="white">
<TD>804-5884-10</TD>
<TD WIDTH="90%"><B>Sun Ultra 60 Hardware AnswerBook Installation</B></TD>
<TD><A HREF="/products-n-solutions/hardware/docs/pdf/804-5884-10.pdf" TARGET="results">pdf</A> (42KB)</TD></TR>
<TR VALIGN="TOP" CLASS="lttan">
<TD>804-5886-10</TD>
<TD><B>Installing the Sun Ultra 60 ShowMe How Multimedia Documentation</B></TD>
<TD><A HREF="/products-n-solutions/hardware/docs/pdf/804-5886-10.pdf" TARGET="results">pdf</A> (62KB)</TD></TR>
<TR VALIGN="TOP" CLASS="white">
<TD>805-1709-12</TD>
<TD><B>Sun Ultra 60 Service Manual</B></TD>
<TD><A HREF="/products-n-solutions/hardware/docs/pdf/805-1709-12.pdf" TARGET="results">pdf</A> (6.5MB)</TD></TR>
<TR VALIGN="TOP" CLASS="lttan">
<TD>805-1762-11</TD>
<TD><B>Sun Ultra 60 Reference Manual</B></TD>
<TD><A HREF="/products-n-solutions/hardware/docs/pdf/805-1762-11.pdf" TARGET="results">pdf</A> (344KB)</TD></TR>
</TABLE>
...but obviously there is a lot of other formatting on the page getting in my way.
Ideas on any really useful modules?
Any suggestions or tips would be helpful
Thanks monks,
m
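For illustration only, here is a minimal sketch of one way the rows above could be pulled apart with HTML::TokeParser (part of the HTML-Parser distribution on CPAN). It assumes the three wanted pieces are the part number, the title, and the PDF link, which the question does not actually spell out; the sub name extract_docs and the hash keys are made up for this example, and the part-number pattern is a guess based on the four sample rows.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TokeParser;    # CPAN module, not core Perl

# Sample input: the category row plus two data rows from the table above.
my $html = <<'HTML';
<TABLE WIDTH="100%" BORDER="0" CELLSPACING="1" CELLPADDING="3">
<!-- CATEGORY -->
<TR><TD CLASS="dkblue" COLSPAN="3"><A NAME="Sun Ultra 60"></A><BIG>
<B>Sun Ultra 60 Documentation</B></BIG></TD></TR>
<TR VALIGN="TOP" CLASS="white">
<TD>804-5884-10</TD>
<TD WIDTH="90%"><B>Sun Ultra 60 Hardware AnswerBook Installation</B></TD>
<TD><A HREF="/products-n-solutions/hardware/docs/pdf/804-5884-10.pdf" TARGET="results">pdf</A> (42KB)</TD></TR>
<TR VALIGN="TOP" CLASS="lttan">
<TD>805-1709-12</TD>
<TD><B>Sun Ultra 60 Service Manual</B></TD>
<TD><A HREF="/products-n-solutions/hardware/docs/pdf/805-1709-12.pdf" TARGET="results">pdf</A> (6.5MB)</TD></TR>
</TABLE>
HTML

# Hypothetical helper: returns a list of { part, title, url } hashes,
# one per table row that looks like a document entry.
sub extract_docs {
    my ($source) = @_;
    my $parser = HTML::TokeParser->new(\$source)
        or die "Could not create parser: $!";

    my (@docs, @cells, $href);
    my $in_td = 0;

    while (my $token = $parser->get_token) {
        my ($type, $tag) = @$token;    # tag names arrive lowercased

        if ($type eq 'S' && $tag eq 'tr') {
            @cells = ();               # new row: forget the previous one
            $href  = undef;
        }
        elsif ($type eq 'S' && $tag eq 'td') {
            push @cells, '';
            $in_td = 1;
        }
        elsif ($type eq 'E' && $tag eq 'td') {
            $in_td = 0;
        }
        elsif ($type eq 'S' && $tag eq 'a' && defined $token->[2]{href}) {
            $href = $token->[2]{href}; # attribute names are lowercased too
        }
        elsif ($type eq 'T' && $in_td) {
            $cells[-1] .= $token->[1]; # text may arrive in several chunks
        }
        elsif ($type eq 'E' && $tag eq 'tr') {
            s/^\s+|\s+$//g for @cells; # trim each cell
            # Assumption: a data row has three cells, a part number such as
            # 804-5884-10 in the first one, and a link somewhere in the row.
            next unless @cells == 3 && defined $href;
            next unless $cells[0] =~ /^\d{3}-\d{4}-\d{2}$/;
            push @docs, { part => $cells[0], title => $cells[1], url => $href };
        }
    }
    return @docs;
}

my @docs = extract_docs($html);
print join("\t", @{$_}{qw(part title url)}), "\n" for @docs;
```

Because the category row has a single cell and no HREF, it falls through the filter, which is also what should keep most of the unrelated formatting on the page out of the results. Each returned hash could then be handed to whatever database layer is in use.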
Replies are listed 'Best First'.
Re: Hints & Tips on passing HTML?
by Ryszard (Priest) on Feb 28, 2003 at 08:00 UTC
Re: Hints & Tips on passing HTML?
by grantm (Parson) on Feb 28, 2003 at 08:38 UTC
by Hofmator (Curate) on Feb 28, 2003 at 09:43 UTC
Re: (nrd) Hints & Tips on passing HTML?
by newrisedesigns (Curate) on Feb 28, 2003 at 17:47 UTC
Re: Hints & Tips on passing HTML?
by Fletch (Bishop) on Feb 28, 2003 at 17:30 UTC
Re: Hints & Tips on passing HTML?
by revdiablo (Prior) on Feb 28, 2003 at 23:30 UTC
Re: Hints & Tips on passing HTML?
by heezy (Monk) on Feb 28, 2003 at 23:54 UTC