Re: strip header from page fetched w/ LWP

The surest way would be to use something like HTML::TreeBuilder or HTML::Parser to parse the contents and then extract everything from the <body> you're interested in.
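
A minimal sketch of that approach with HTML::TreeBuilder. The sample markup in `$html` is a made-up stand-in for whatever LWP returned, and the variable names are only illustrative:

```perl
use strict;
use warnings;
use HTML::TreeBuilder;
use HTML::Entities qw(encode_entities);

# Hypothetical sample page; in practice this would be the content
# fetched with LWP (e.g. the decoded response body).
my $html = <<'HTML';
<html><head><title>Demo</title></head>
<body><h1>Hello</h1><p>World</p></body></html>
HTML

my $tree = HTML::TreeBuilder->new_from_content($html);

# Find the <body> element; everything in <head> is left behind.
my $body = $tree->look_down(_tag => 'body');

# Serialize the children of <body> without the <body> wrapper itself.
# Child elements are objects; bare text nodes are plain strings that
# need to be re-escaped by hand.
my $inner = join '', map {
    ref $_ ? $_->as_HTML('<>&', undef, {}) : encode_entities($_, '<>&')
} $body->content_list;

print $inner, "\n";

# Free the tree explicitly (needed on older HTML::Tree releases).
$tree->delete;
```

Passing an empty hash ref as the third argument to `as_HTML` asks it to emit every closing tag rather than omitting the optional ones. If keeping the `<body>` wrapper is acceptable, calling `$body->as_HTML` directly is shorter still.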

Re: Re: strip header from page fetched w/ LWP by geektron (Curate) on Feb 09, 2004 at 17:42 UTC
Well, since the markup is already there, and all I needed to do was exactly what the anonymous suggestion described, your recommendation is a little over the top for now.
Re: Re: Re: strip header from page fetched w/ LWP by hardburn (Abbot) on Feb 09, 2004 at 17:53 UTC
No, using an HTML parser is the correct solution. It's impossible to properly parse HTML with pure regexen (it's possible with Perl's extended regexen, but it's still messy). It's hardly over the top; coding it with a parser would probably have taken as much time as it took for you to come up with broken regex solutions.

----
`: () { :\|:& };:`
Note: All code is untested, unless otherwise stated
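
To illustrate how a regex can go wrong here, a small sketch follows. The sample page and the pattern are hypothetical, not the regex actually used earlier in the thread; the point is only that a comment mentioning `<body>` is enough to fool a naive match:

```perl
use strict;
use warnings;

# Hypothetical page: a comment inside <head> happens to mention <body>.
my $html = '<html><head><!-- <body> goes below --><title>t</title></head>'
         . '<body><p>hi</p></body></html>';

# Naive attempt: grab whatever sits between <body ...> and </body>.
my ($captured) = $html =~ m{<body[^>]*>(.*)</body>}is;

# The match starts at the "<body>" inside the comment, so the capture
# drags the rest of the comment and the <title> along with it.
print "regex captured: $captured\n";
print "head content leaked into the result\n" if $captured =~ /<title>/;
```

A real parser knows that the first `<body>` is inside a comment and skips it, which is the kind of case that is tedious to cover with patterns alone.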