If you just want to crawl the links on a site
and dump the pages to disk, wget is the way to go
(regardless of whether the links are dynamic or static).
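For a plain crawl-and-save job, something along these lines usually does it (the URL is a placeholder; adjust the depth to taste):

    wget -r -l 2 -np -k -p http://www.example.com/

-r recurses, -l caps the depth, -np keeps it from wandering above the starting directory, -k rewrites links for local browsing, and -p pulls in the images/CSS needed to render each page.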
The form issue is going to get you into trouble.
I doubt you'll find any general-purpose tool for
doing what you want, because there are so many variables:
should it enter one set of values in every form, and what are
those values? Is there JavaScript that mutates the form
input prior to submission?
The good news is, using LWP, HTML::Tree and HTML::Element it's REALLY easy to do the following (rough sketch after the list):
- download a page
- scrape all of the links from that page and remember them
- check for any forms on that page
- get a list of all the form elements in each form
- prompt the user how to fill out the form, and/or look up in some data structure what to do with forms/elements that have those names.
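To make that concrete, here's a rough sketch of the fetch/scrape/inspect steps (the URL is a placeholder, and a real crawler would also need a list of already-visited links):

    #!/usr/bin/perl
    use strict;
    use warnings;
    use LWP::UserAgent;
    use HTML::TreeBuilder;    # from the HTML-Tree distribution

    my $url = 'http://www.example.com/';   # placeholder starting point

    my $ua       = LWP::UserAgent->new;
    my $response = $ua->get($url);
    die "Couldn't fetch $url: ", $response->status_line
        unless $response->is_success;

    my $tree = HTML::TreeBuilder->new_from_content($response->decoded_content);

    # remember every link on the page so it can be crawled later
    my @links = map { $_->attr('href') }
                $tree->look_down(_tag => 'a', sub { defined $_[0]->attr('href') });

    # list each form and its input elements, so the user (or a lookup
    # table keyed on the element names) can decide what to submit
    for my $form ($tree->look_down(_tag => 'form')) {
        print "Form: ", ($form->attr('action') || '(no action)'), "\n";
        for my $field ($form->look_down(_tag => qr/^(?:input|select|textarea)$/)) {
            printf "  %-8s name=%s\n", $field->tag, ($field->attr('name') || '(unnamed)');
        }
    }

    $tree->delete;    # free the parse tree when done

From there it's just a matter of looping over @links and feeding the element names into whatever prompt or lookup you settle on.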
Thanks a lot
I didn't know these modules existed, so you really saved me a lot of time and work.
If I come up with a general-use tool, I will post it in PerlMonks.
Once again, thanks.
I don't know of any easy-peasy Perl hacks to do this (a la wget), but I think that LWP::UserAgent definitely has the functionality you'll need to accomplish the task. Good luck!
-fuzzyping