in reply to Links for Screen Scraping
WWW::Mechanize is a subclass of LWP::UserAgent, so all LWP::UserAgent methods are available with WWW::Mechanize too, and posting works the same as with LWP::UserAgent. In most cases, though, the POST request is made with the WWW::Mechanize click method, or with the submit method if there is no button to click. Reading the WWW::Mechanize documentation should prove helpful when looking for other interesting methods.
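A minimal sketch of both approaches, assuming a login form and a search URL that are made up for illustration (the URLs, field names and the button name `login` are hypothetical, not taken from any real site):

```perl
use strict;
use warnings;
use WWW::Mechanize;

# autocheck makes failed requests die instead of failing silently.
my $mech = WWW::Mechanize->new( autocheck => 1 );

# Hypothetical page containing the form to fill in.
$mech->get('http://www.example.com/login');

# Select the first form on the page and set its fields
# (the field names are assumptions).
$mech->form_number(1);
$mech->field( username => 'me' );
$mech->field( password => 'secret' );

# Press the button named 'login' (hypothetical name). If the form's
# method is POST, this sends the POST request.
$mech->click('login');

# With no button to click, submit the current form directly instead:
# $mech->submit;

# The inherited LWP::UserAgent interface still works, e.g. a raw POST
# to a hypothetical search URL:
my $response = $mech->post( 'http://www.example.com/search', { q => 'perl' } );
print $response->status_line, "\n";
```

After either call, `$mech->content` holds the HTML of the page that came back.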
For some more specialized scraping modules, take a look at the WWW::Search:: module namespace and the Finance::Banking:: module namespace.
I haven't found any problems in using WWW::Mechanize for all my scraping needs, together with HTML::TableExtract to scrape stuff out of HTML tables afterwards.
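As a rough sketch of the table step, the inline HTML below stands in for what `$mech->content` would return after a fetch; the table, its column headers and its values are invented for illustration:

```perl
use strict;
use warnings;
use HTML::TableExtract;

# Sample HTML standing in for $mech->content (invented data).
my $html = <<'HTML';
<table>
  <tr><th>Date</th><th>Amount</th></tr>
  <tr><td>2004-05-25</td><td>12.50</td></tr>
  <tr><td>2004-05-26</td><td>7.00</td></tr>
</table>
HTML

# Only tables whose header row contains these column names are extracted.
my $te = HTML::TableExtract->new( headers => [ 'Date', 'Amount' ] );
$te->parse($html);

# Collect the data rows of every matching table.
my @rows;
for my $table ( $te->tables ) {
    for my $row ( $table->rows ) {
        push @rows, [@$row];
    }
}

# Print one tab-separated line per data row.
print join( "\t", @$_ ), "\n" for @rows;
```

Selecting tables by header text rather than by position tends to survive small layout changes on the scraped page.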
Unless you implement a DOM, you will have to interpret the JavaScript on the pages yourself and convert it to Perl code manually.
Re: Links for Screen Scraping
by Abigail-II (Bishop) on May 26, 2004 at 10:52 UTC