in reply to Re^2: Scraping Webpage
in thread Scraping Webpage
In my humble defense:
I wrote an entire web page that would elicit HEAD, and every other request available in the HTTP 1.0/1.1 spec, including downloading the entire page. This included sanitizing input, creating the form fields, and adding graphics and CSS. I completed the entire page in under 5 minutes, and I chose LWP, and only LWP. Why? Because, in spite of your assertion, WWW::Mechanize adds complexity and overhead in this scenario. The OP's request is a bone-headed/dead-simple one, exactly what LWP was made for.
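For what it's worth, here is a rough sketch of that kind of plain-LWP exchange (the URL is made up and the details are illustrative, not a drop-in for the OP's page):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical URL, for illustration only.
my $url = 'http://www.example.com/';

my $ua = LWP::UserAgent->new( timeout => 10 );

# A HEAD request first, then a full GET -- plain LWP handles both.
my $head = $ua->head($url);
print 'HEAD: ', $head->status_line, "\n";

my $get = $ua->get($url);
die 'GET failed: ' . $get->status_line . "\n" unless $get->is_success;

my $html = $get->decoded_content;
print 'Fetched ', length($html), " bytes\n";
```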
In fact, completing the OP's request would have required only one additional module: HTML::Restrict (and there are others). The module I listed will strip the HTML tags of your choosing, leaving the OP with an easily controlled/formatted document to display however the OP wishes.
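A minimal sketch of that tag-stripping step with HTML::Restrict (the rules and the sample markup are invented for illustration; with no rules at all it strips every tag):

```perl
use strict;
use warnings;
use HTML::Restrict;

# Keep only the tags/attributes listed in the rules; everything else is stripped.
my $hr = HTML::Restrict->new(
    rules => {
        b => [],          # keep <b>, drop its attributes
        a => ['href'],    # keep <a>, but only its href attribute
    },
);

# Sample markup invented for illustration.
my $html  = '<div><a href="/foo" onclick="x()">link</a> and <b>bold</b> text</div>';
my $clean = $hr->process($html);

print $clean, "\n";    # <a href="/foo">link</a> and <b>bold</b> text
```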
I hope this provides some insight for the OP.
--Chris
```
#!/usr/bin/perl -Tw
use Perl::Always or die;
my $perl_version = (5.12.5);
print $perl_version;
```
Replies are listed 'Best First'.

- Re^4: Scraping Webpage by Anonymous Monk on Nov 19, 2013 at 22:10 UTC
- by taint (Chaplain) on Nov 19, 2013 at 22:34 UTC
- by Your Mother (Archbishop) on Nov 20, 2013 at 02:31 UTC
- by taint (Chaplain) on Nov 20, 2013 at 04:02 UTC