As for threads, you may or may not find them useful. Perl threads do have issues (for instance, you cannot share most objects, and it may or may not be possible to use a shared socket, depending on your code).
A relatively simple and robust way to make something like this threaded is to use Thread::Queue: put the starting URLs or user names in a queue and have a few worker threads, each with its own WWW::Mechanize object, that pop URLs from the queue, parse the information, and push the results onto another Thread::Queue that the "main" thread then reads. See the sketch below.
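A minimal sketch of that setup, assuming four workers and a single seed URL (both made-up values), with nothing shared between threads except the two queues:

    use strict;
    use warnings;
    use threads;
    use Thread::Queue;
    use WWW::Mechanize;

    my $url_queue    = Thread::Queue->new;
    my $result_queue = Thread::Queue->new;

    # Each worker gets its own WWW::Mechanize object.
    sub worker {
        my $mech = WWW::Mechanize->new( autocheck => 0 );
        while ( defined( my $url = $url_queue->dequeue ) ) {
            $mech->get($url);
            next unless $mech->success;
            # Push whatever you parsed out; here just the page title.
            $result_queue->enqueue( $url . "\t" . ( $mech->title // '' ) );
        }
    }

    my @workers = map { threads->create( \&worker ) } 1 .. 4;

    $url_queue->enqueue('http://example.com/');   # seed URLs go here
    $url_queue->enqueue(undef) for @workers;      # one undef per worker to stop it

    $_->join for @workers;

    # The "main" thread reads the results.
    while ( defined( my $line = $result_queue->dequeue_nb ) ) {
        print $line, "\n";
    }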
update: now that you've erased your original question, it's kind of hard to discuss it.
1. The big advantage of WWW::Mechanize is that it abstracts away all the cruft you don't want to think about when building web crawlers (such as how to robustly match HTML links, fill in forms, find images, and so on). Most of those things are not too hard, but chances are very high you'll miss corner cases (for instance, HTML attributes may be single quoted, double quoted or unquoted, and may contain unescaped < and even > characters).
In any case, using WWW::Mechanize's forms() method gives you a much nicer interface to query the form(s) on a page.
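For example (the URL and field names below are just placeholders):

    use strict;
    use warnings;
    use WWW::Mechanize;

    my $mech = WWW::Mechanize->new;
    $mech->get('http://example.com/login');

    # forms() returns a list of HTML::Form objects for the page.
    for my $form ( $mech->forms ) {
        printf "form action: %s\n", $form->action;
        printf "  input: %s\n", $_->name for grep { defined $_->name } $form->inputs;
    }

    # Or fill in and submit a form directly:
    $mech->submit_form(
        form_number => 1,
        fields      => { user => 'someone', pass => 'secret' },
    );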
2. If your code really doesn't need any sharing of information, you might as well use fork(). For a simpler interface you may want to check out Parallel::ForkManager.
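A rough sketch of the fork()-based alternative with Parallel::ForkManager (the URL list and the limit of 4 children are made up):

    use strict;
    use warnings;
    use Parallel::ForkManager;
    use WWW::Mechanize;

    my @urls = qw( http://example.com/a http://example.com/b );
    my $pm   = Parallel::ForkManager->new(4);   # at most 4 children at once

    for my $url (@urls) {
        $pm->start and next;                    # parent moves on to the next URL
        my $mech = WWW::Mechanize->new;
        $mech->get($url);
        print "$url: ", $mech->status, "\n";
        $pm->finish;                            # child exits
    }
    $pm->wait_all_children;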