Form filling will be tricky with wget, but the rest is possible.
With wget, you're limited to whatever processing you can do by piping its output to another program. With LWP, you have all of Perl available to work on the data. In many situations wget is enough, but LWP is there when it's not.
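For instance, a minimal LWP::Simple sketch; the URL and the link-counting are just placeholders for whatever processing you actually need:

    use strict;
    use warnings;
    use LWP::Simple;

    # Fetch the page straight into a Perl scalar -- no external process, no pipe.
    my $html = get('http://www.example.com/')
        or die "Couldn't fetch the page\n";

    # From here on it's ordinary Perl: count links, feed it to a parser, whatever.
    my @links = $html =~ /href="([^"]+)"/gi;
    print scalar(@links), " links found\n";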
"There is no shame in being self-taught, only in not trying to learn in the first place." -- Atrus, Myst: The Book of D'ni.
I use whatever fits the job:
- If I want simplicity, I use LWP::Simple.
- If I need to fill out forms, I use WWW::Mechanize (sketch below).
- If I need to connect to a busy server, or through a flaky network where I need retries, I use wget.
- If I need something and LWP isn't available, lynx might do the job as well. Or ftp, or ncftp.
- If it's being done over FTP and I need something fancier than retrieving a document, I use Net::FTP.
- If I want to recursively download something, or continue a partially downloaded file, I use wget.
- If I want to retrieve something and display it immediately, system "mozilla URL" might do the trick, or I use the remote-control functionality of a running browser.
- If I need to be really fancy, I use LWP::UserAgent.
- And for debugging, I might use "telnet host 80" from the command line.
I've used all of the methods I mentioned above. As with most programming techniques, it's a matter of finding the right trade-off between simplicity of the interface, your needs, your knowledge and experience with the tool, the functionality offered, and availability. It's a fallacy to think one tool is "better" than another. A carpenter isn't going to say "I have a hammer and a screwdriver - why have both?" either.
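For the form-filling case mentioned above, a rough WWW::Mechanize sketch; the URL and the form field names are made-up placeholders:

    use strict;
    use warnings;
    use WWW::Mechanize;

    my $mech = WWW::Mechanize->new();
    $mech->get('http://www.example.com/login');    # hypothetical page

    # Fill out and submit the first form on the page.
    $mech->submit_form(
        form_number => 1,
        fields      => {
            username => 'me',        # field names are assumptions
            password => 'secret',
        },
    );

    print $mech->content();    # the page returned after submission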
Also check out Curl and libCurl. There is a Perl module too, WWW::Curl (the WWW-Curl-2.0 distribution on CPAN).
It is very powerful.
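A minimal sketch along the lines of the WWW::Curl::Easy documented usage, as I recall it; the URL is a placeholder:

    use strict;
    use warnings;
    use WWW::Curl::Easy;

    my $curl = WWW::Curl::Easy->new;
    $curl->setopt(CURLOPT_URL, 'http://www.example.com/');

    # Collect the body into a Perl scalar instead of printing it to stdout.
    my $response_body;
    open(my $fh, '>', \$response_body) or die "Can't open scalar: $!";
    $curl->setopt(CURLOPT_WRITEDATA, $fh);

    my $retcode = $curl->perform;
    if ($retcode == 0) {
        print "Fetched ", length($response_body), " bytes\n";
    } else {
        print "Error $retcode: ", $curl->strerror($retcode), "\n";
    }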
I'm not really a human, but I play one on earth.
flash japh
Most of the time, when I have to snarf a web page, I also have to extract some data from it afterwards. I think that's *way* easier to do with the tools in perl than with a
cat | sed | awk | sort | sed | diff | sed | sed | awk | sed
chain. In perl, I can assign the $response to a variable, walk through it, strip the HTML, pull out the tabular data, verify it against what I expected, and stuff it into a db - all in one program. AND I can check for errors occurring at any of those steps.
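Roughly what that one-program approach might look like; the URL, the table layout, and the database details are all placeholder assumptions, and a real HTML parser (HTML::TableExtract, HTML::TreeBuilder) would be more robust than the regex:

    use strict;
    use warnings;
    use LWP::UserAgent;
    use DBI;

    my $ua       = LWP::UserAgent->new;
    my $response = $ua->get('http://www.example.com/report.html');   # placeholder URL
    die "Fetch failed: ", $response->status_line, "\n"
        unless $response->is_success;

    my $html = $response->content;

    # Walk the tabular data -- naively, two cells per row.
    my @rows;
    while ($html =~ m{<tr>\s*<td>(.*?)</td>\s*<td>(.*?)</td>}gis) {
        push @rows, [ $1, $2 ];
    }
    die "Page didn't look like what I expected\n" unless @rows;

    # Stuff it into a db, checking errors at every step (RaiseError does that).
    my $dbh = DBI->connect('dbi:SQLite:dbname=scrape.db', '', '', { RaiseError => 1 });
    $dbh->do('CREATE TABLE IF NOT EXISTS report (name TEXT, value TEXT)');
    my $sth = $dbh->prepare('INSERT INTO report (name, value) VALUES (?, ?)');
    $sth->execute(@$_) for @rows;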
I've known people who spend all their time in sed/awk and can whip up scripts to do everything there - and I'm sure people can do it in emacs and make and C. I choose perl. Whatever works for you.
True, but $response = `wget -O- URL`; is shorter than use LWP::Simple; $response = get "URL";, while still enabling you to use the full power of Perl to parse it.
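For comparison, both one-liners side by side; example.com stands in for the real URL, and the wget version of course assumes wget is installed:

    use strict;
    use warnings;
    use LWP::Simple;

    # Shelling out to wget: -O- sends the document to stdout, which backticks capture.
    my $via_wget = `wget -q -O- http://www.example.com/`;

    # The pure-Perl equivalent.
    my $via_lwp = get('http://www.example.com/');

    # Either way, the rest is plain Perl, e.g. pulling out the title:
    my ($title) = $via_lwp =~ m{<title>(.*?)</title>}is;
    print "$title\n" if defined $title;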