in reply to Harvesting and Parsing HTML from other sites
Parsing HTML using regular expressions is generally a very bad idea. Sooner or later you will come across markup that breaks your regular expressions.
You are far better off using a real HTML parser. There is an HTML::Parser module on the CPAN, and you'd be better off using that or one of its subclasses. It sounds to me as if HTML::TreeBuilder might be just what you need in this instance.
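To give a rough idea of what that looks like, here is a minimal sketch that pulls the links out of a page with HTML::TreeBuilder. The sample markup and the variable names are made up for illustration, and I'm assuming you already have the page's HTML in a string (from LWP or wherever); adapt the look_down criteria to whatever you are actually harvesting.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical sample markup standing in for a page you have fetched.
# Note the second link is split across lines, the sort of thing that
# tends to trip up a hand-rolled regex.
my $html = <<'HTML';
<html><body>
<p>See <a href="http://www.cpan.org/">CPAN</a> and
<a
  href="http://www.perl.com/">Perl.com</a>.</p>
</body></html>
HTML

# Build a parse tree from the string.
my $tree = HTML::TreeBuilder->new_from_content($html);

# In list context, look_down returns every element that matches all
# the criteria: here, <a> elements that have an href attribute.
my @links = map { [ $_->attr('href'), $_->as_text ] }
            $tree->look_down( _tag => 'a',
                              sub { defined $_[0]->attr('href') } );

printf "%s => %s\n", @$_ for @links;

# Explicitly free the tree; older HTML::Tree releases need this.
$tree->delete;
```

The parser copes with the attribute on its own line, odd quoting and so on, so the extraction code only has to say which elements it wants.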
--
"Perl makes the fun jobs fun
and the boring jobs bearable" - me
Re: Re: Harvesting and Parsing HTML from other sites
by hostile17 (Novice) on Mar 28, 2001 at 14:15 UTC | |
by marius (Hermit) on Mar 28, 2001 at 21:28 UTC |